A 64-Channel 965-µW Neural Recording SoC with UWB Wireless Transmission in 130-nm CMOS

This paper presents a 64-channel neural recording system-on-chip (SoC) with 20-Mbps wireless telemetry. Each channel of the analog front-end consists of a low-noise band-pass amplifier, featuring a noise efficiency factor (NEF) of 3.11 with an input-referred noise of 5.6 µV rms in a 1-Hz to 10-kHz band, and a 31.25-kSps, 6-fJ/conversion-step, 10-bit SAR A/D converter. The recorded signals are multiplexed in the digital domain and transmitted via an 11.7%-efficiency pulse-position-modulation UWB transmitter (TX), reaching a transmission range in excess of 7.5 m. The chip has been fabricated in a 130-nm CMOS process, measures 25 mm² and dissipates 965 µW from a 0.5-V supply. This SoC features the lowest power per channel (15 µW) and the lowest energy per bit (48.2 pJ) among state-of-the-art wireless neural recording systems with more than 32 channels. The proposed circuit is able to transmit the raw neural signal over a large bandwidth (up to 10 kHz) without performing any data compression or losing vital information, such as local field potentials (LFPs).


I. INTRODUCTION
Wireless multi-channel neural recording systems are in high demand in neuroscience experiments with laboratory animals to study complex brain behavior. They are also critical components in Brain Machine Interface (BMI) systems, to restore motor function to amputees and patients suffering from paralysis [1]. In neuroscience, their adoption improves the animal's freedom of movement and reduces motion artifacts and tethering effects. In neuroprosthetics, wireless solutions lower the risk of infection. The requirements of a large number of recording channels (64 or more) and a wide signal bandwidth (10 kHz) translate into a high output data rate (>10 Mbps) for the transmitter. This throughput should be reached with minimum power consumption, to allow the use of a battery supply in neuroscience systems and to limit tissue necrosis and allow wireless powering in neuroprosthetic devices. Existing SoC designs overcome this limitation by using on-chip processing to narrow the required bandwidth. To this aim, the pioneering work in [2] detects and transmits just the occurrence of action potentials (APs), while the system in [3] transmits only the samples corresponding to the detected spikes, reducing the output data rate to 1.4 Mbps for 64 channels. However, spike detection results in a loss of vital information, such as local field potentials (LFPs) in the 1-300 Hz band.
Systems able to transmit wide-bandwidth raw data without resorting to data compression have also been presented. The system in [4] allows the transmission of 90-Mbit/s raw data from 128 channels with a power budget of 6 mW (47 µW per channel, 1-m TX range) employing a UWB transmitter, while the work in [5] achieves an outstanding 2.6-µW power per channel, but for a restricted number of channels (4, with a maximum data rate of 1 Mbit/s) and for a limited transmission range of 1 mm, adopting a load-shift-keying (LSK) modulator. Recently, wireless neural systems for electrocorticography (ECoG) recording from the surface of the cerebral cortex have also gained popularity [6] because of their lower invasiveness and lower data rate. However, the direct recording of APs is the only type of BMI proven to provide enough temporal and spatial resolution to control complex robotic prostheses [5]. In conclusion, the trade-off between high data rate and low power consumption in wireless neural recording systems is still an open issue. To address it, this work aims at showing the feasibility of a very low power integrated circuit for multi-channel recording and wireless transmission of raw data (signal frequency up to 10 kHz) at high bit-rate, featuring an 8-GHz UWB transmitter. The proposed proof-of-concept has been implemented in a 130-nm CMOS technology with a 0.5-V supply, targeting a power consumption lower than 1 mW for 64 channels sampled at about 30 kSps with a resolution of 10 bit, which is equivalent to a 20-Mbps data rate. It features an overall power consumption of 965 µW, corresponding to 15 µW/channel and 48.2 pJ/bit, these figures-of-merit being the lowest among state-of-the-art wide-band wireless neural recording systems with more than 32 channels. Its functionality has been validated in benchtop tests with pre-recorded neural traces and an external patch antenna, allowing a transmission range in excess of 7.5 m. Most of the design effort was spent on maximizing the efficiency of each single block (channel amplifier, A/D converter and UWB transmitter) while facing the severe issues that the reduction of the supply voltage imposes in terms of noise, dynamic range, power consumption and TX range. These features make the proposed circuit a valid option for both head-mounted and implantable BMI applications, once provided with an energy harvesting solution, such as inductive powering, and a miniaturized UWB antenna.
of the incoming signals. The digital data are then serialized by means of parallel-in/serial-out (PISO) shift registers and sent to a high-efficiency UWB transmitter. All the operations are synchronized by a clock manager circuit, composed of an 80-MHz Pierce oscillator and two frequency dividers generating the clock signals for the channel converters (31.25 kHz) and for the PISO registers (20 MHz). The transmission protocol is managed by a logic circuit (sync logic in Fig. 1(a)) that periodically adds a synchronization header to the data stream so that the recorded signals can be correctly reconstructed at the receiver side. A power-on-reset (POR) circuit, which provides a fast system start-up, and a bias reference circuit complete the neural recording unit. The specifications of each single block were derived by taking into account that the typical noise due to electrode and background neural activity is 10-20 µV rms in a 10-kHz band [2], [4] and that the maximum signal (both LFP and AP) is 1-2 mV, resulting in an SNR close to 40 dB. Therefore, to be conservative, the target input noise of the front-end was set to 5 µV rms, the ADC resolution to 10 bit and the overall gain variable from 40 dB to about 60 dB, corresponding to an input-referred LSB of 10 µV and 1 µV, respectively. Finally, considering that in both implantable and head-mounted applications there would be additional impairments, and to guarantee sufficient link margin, the TX was designed to achieve a transmission range significantly larger than the 3 m typically required by the envisioned applications. To meet these specifications with minimum power consumption, the supply was lowered to 0.5 V while still retaining the optimum performance of the analog blocks. Circuit solutions were combined to get a Power Efficiency Factor (PEF) of the front-end almost 2x better than the state-of-the-art [5]. The ADC was designed to attain a sub-10 fJ/conversion-step efficiency, while an energy per transmitted bit better than 50 pJ/b and a TX range larger than 7 m were achieved employing a UWB transmitter.
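The front-end specifications above can be checked with a short calculation. The sketch below is only illustrative: the function names are hypothetical, and the 1-V ADC full scale is an assumption (a differential swing of twice the 0.5-V supply), chosen because it reproduces the quoted LSB values; the text does not state the full-scale range explicitly.

```python
import math

def input_referred_lsb(full_scale_v: float, n_bits: int, gain_db: float) -> float:
    """Input-referred LSB (volts) of an ADC preceded by an amplifier."""
    gain = 10 ** (gain_db / 20)  # dB to linear voltage gain
    return full_scale_v / (2 ** n_bits) / gain

def snr_db(signal_v: float, noise_v_rms: float) -> float:
    """Signal-to-noise ratio in dB for a voltage ratio."""
    return 20 * math.log10(signal_v / noise_v_rms)

# ASSUMPTION: 1-V differential full scale (2 x 0.5-V supply); not stated in the text.
ADC_FULL_SCALE_V = 1.0
N_BITS = 10

lsb_40db = input_referred_lsb(ADC_FULL_SCALE_V, N_BITS, 40)  # lowest gain
lsb_60db = input_referred_lsb(ADC_FULL_SCALE_V, N_BITS, 60)  # highest gain
snr = snr_db(1e-3, 10e-6)  # 1-mV signal over 10-µV rms background noise

print(f"LSB at 40 dB: {lsb_40db * 1e6:.2f} uV")
print(f"LSB at 60 dB: {lsb_60db * 1e6:.2f} uV")
print(f"SNR: {snr:.1f} dB")
```

Under this assumption the input-referred LSB comes out just under 10 µV and 1 µV at the two gain settings, so the quantization step stays below the 10-20 µV rms background noise.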

A. Analog front-end
To cope with the low supply voltage, low-noise and wide-swing analog circuit solutions have been investigated. The amplifying chain, as shown in Fig. 2, is a cascade of two capacitively coupled amplifiers providing ac amplification and bandpass filtering from 1 Hz to 10 kHz, while a 10-bit ADC performs the digitization at a 31.25-kSps rate. The first stage is a low-noise amplifier (LNA), ac coupled to reject the offset of the electrode-tissue interface, featuring a gain of 40 dB set by the CIN/CF ratio. A subthreshold PMOS transistor in the feedback path sets the high-pass corner frequency to 0.1 Hz. To speed up the amplifier dc stabilization at start-up, its gate is connected to the output of a power-on reset circuit. To improve the gm/I ratio, the operational transconductance amplifier (OTA) has been implemented with a current-reuse technique [4]. Thick-oxide input transistors were adopted to reduce the gate current, which could otherwise saturate the stage. Further power and area reductions were obtained using a self-biased common-mode feedback network in the OTA first stage. When working at a low supply, the optimization of the voltage swing is essential. To this aim, the second stage of the OTA features a class-AB output and its common-mode voltage is precisely kept at half the supply by a highly linear common-mode feedback network with four cross-coupled pseudo-resistors (see inset of Fig. 2). The configuration cancels non-linearities arising from differential signals, thus suppressing common-mode artifacts and preserving the available headroom. The second amplifier (PGA) provides an additional gain of 0-20 dB depending on an external control. To always keep the channel bandwidth at about 10 kHz, both the input and the Miller compensation capacitances are switched depending on the selected gain. Note that the high-pass cut-off frequency of the PGA (1 Hz) differs from the high-pass frequency of the LNA (0.1 Hz). This choice greatly reduces the input-referred 1/f² noise contribution due to the LNA pseudo-resistors [3]. Regarding the A/D conversion, the low-power requirements were pursued by adopting a 10-bit fully-differential SAR converter with asynchronous logic, dynamic comparator and monotonic switching algorithm [7]. To cope with the area limitation of each channel, a binary-weighted with attenuation (BWA) capacitor DAC was chosen, since it is the only topology that makes it possible to adopt standard highly-matched 34-fF MiM capacitors, instead of sub-fF full-custom capacitors, for the same total array capacitance of 4.28 pF [7]. Two bootstrapped switches sample the incoming signal directly at the comparator inputs, allowing the ADC to operate from a 0.5-V supply without linearity degradation. The 64 channels are sampled at 31.25 kSps and the output bits are stored in 64 PISO registers. The digitized data are then serialized at a rate of 20 MHz (64 ch. × 10 bit/ch. × 31.25 kHz) and sent to the transmitter (TX). The time-division multiplexing (TDM) in the digital domain avoids the use of power-hungry line-buffers with a sequential turn-on procedure [4], which can lead to channel cross-talk.
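The digital time-division multiplexing described above can be sketched in a few lines. This is a behavioral illustration only, not the on-chip logic: the function names are hypothetical, and the channel order and MSB-first bit order are assumptions that the text does not specify.

```python
N_CHANNELS = 64
N_BITS = 10
SAMPLE_RATE_HZ = 31_250

# Serial bit rate needed to carry every channel sample without compression.
bit_rate_bps = N_CHANNELS * N_BITS * SAMPLE_RATE_HZ

def serialize_frame(samples):
    """Flatten one 10-bit sample per channel into a serial bit list.

    ASSUMPTION: channels are sent in index order, MSB first.
    """
    if len(samples) != N_CHANNELS:
        raise ValueError("expected one sample per channel")
    bits = []
    for sample in samples:
        if not 0 <= sample < 2 ** N_BITS:
            raise ValueError("sample does not fit in 10 bits")
        for k in range(N_BITS - 1, -1, -1):
            bits.append((sample >> k) & 1)
    return bits

def deserialize_frame(bits):
    """Rebuild the per-channel samples from a serialized frame."""
    samples = []
    for ch in range(N_CHANNELS):
        value = 0
        for bit in bits[ch * N_BITS:(ch + 1) * N_BITS]:
            value = (value << 1) | bit
        samples.append(value)
    return samples

frame_in = [(37 * ch) % 1024 for ch in range(N_CHANNELS)]  # arbitrary test codes
frame_bits = serialize_frame(frame_in)
frame_out = deserialize_frame(frame_bits)
print(bit_rate_bps, len(frame_bits), frame_out == frame_in)
```

Each sampling period produces one 640-bit frame, which is why the serial stream runs at 20 Mb/s.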

B. UWB transmitter
The TX adopts an Impulse-Radio UWB architecture [8] with pulse-position modulation (PPM) and operates in the 7.25-8.5 GHz band, which is unlicensed for UWB communications in Europe, USA and Japan and lies far from the WiFi and cellular blockers. The transmission occurs in packets formed by a 640-bit synchronization header and a data payload, whose length can be up to 1024×640 bits, resulting in a negligible overhead. Short pulses are generated by turning on an 8-GHz digitally-controlled oscillator (DCO) for a few ns (see Fig. 3(a)). This is implemented as an LC-tank oscillator with an NMOS differential pair, which can be tuned by means of a 4-bit bank of binary-weighted MoM capacitances. The DCO operates in a voltage-limited regime with an oscillation amplitude close to 2VDD = 1 V. The tank inductor is directly coupled to a second inductor. This transformer makes it possible to drive the 50-Ω antenna, enhancing its resistance by a factor of 4.
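To put a number on the "negligible overhead" of the packet format described above, the following sketch computes the header fraction and the packet duration at the nominal bit rate. The variable and function names are illustrative, and the maximum payload length is used; shorter payloads give a proportionally larger overhead.

```python
HEADER_BITS = 640                 # synchronization header
MAX_PAYLOAD_BITS = 1024 * 640     # maximum data payload
BIT_RATE_BPS = 20_000_000         # nominal serial bit rate

def header_overhead(payload_bits: int, header_bits: int = HEADER_BITS) -> float:
    """Fraction of the transmitted bits spent on the synchronization header."""
    return header_bits / (header_bits + payload_bits)

overhead = header_overhead(MAX_PAYLOAD_BITS)
packet_duration_s = (HEADER_BITS + MAX_PAYLOAD_BITS) / BIT_RATE_BPS

print(f"overhead: {overhead * 100:.3f} %")
print(f"packet duration: {packet_duration_s * 1e3:.1f} ms")
```

With the longest payload, the header accounts for 1 bit in 1025, i.e. about 0.1% of the stream. Note that 640 bits is also the size of one 10-bit sample from each of the 64 channels, although the text does not say whether the payload is organized that way.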
The duration of the pulse, which establishes the bandwidth of the output spectrum, is set by a counter (see Fig. 1(a)) implemented as the cascade of three True Single Phase Clock (TSPC) registers [8]. Due to the high operating frequency, the counter is powered by a 1.2-V supply generated inside the chip by a fully-integrated charge pump (CP) clocked at 20 MHz. A simplified schematic of the charge pump is shown in Fig. 4. A switched-capacitor topology was adopted with two 10-pF flying capacitors (C1 and C2) and a storage capacitor, CS = 24 pF. An auxiliary CP with the flying capacitor value scaled down by 10x and no storage capacitor is used to turn the switch Ms fully on. The CP efficiency, estimated by transistor-level simulations, is 75%, with 84 µW drawn from the 0.5-V supply.
The pulse-position modulation is accomplished by the TX control unit, which enables the pulse generation on either the first or the second rising edge of the 80-MHz clock occurring within the symbol period.
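The timing of this modulation scheme can be illustrated with a small behavioral model. It is a sketch under stated assumptions rather than a description of the actual control unit: the mapping of bit 0 to the first clock edge and bit 1 to the second, the placement of the first edge at the start of the symbol, and the ideal (jitter-free) receiver with known symbol boundaries are all assumptions, and the function names are hypothetical. Times are kept in integer picoseconds to avoid rounding issues.

```python
BIT_RATE_HZ = 20_000_000
CLOCK_HZ = 80_000_000

SYMBOL_PS = 10**12 // BIT_RATE_HZ   # symbol period: 50 ns
CLOCK_PS = 10**12 // CLOCK_HZ       # clock period: 12.5 ns

def ppm_modulate(bits):
    """Return the start time (ps) of the pulse sent for each bit.

    ASSUMPTION: bit 0 -> first clock edge of the symbol (offset 0),
                bit 1 -> second clock edge (offset of one clock period).
    """
    return [i * SYMBOL_PS + b * CLOCK_PS for i, b in enumerate(bits)]

def ppm_demodulate(pulse_times_ps):
    """Recover bits from pulse times, assuming known symbol boundaries."""
    bits = []
    for i, t in enumerate(pulse_times_ps):
        offset = t - i * SYMBOL_PS
        # Decide by which of the two allowed positions is closer.
        bits.append(1 if offset >= CLOCK_PS // 2 else 0)
    return bits

tx_bits = [0, 1, 1, 0]
pulse_times = ppm_modulate(tx_bits)
rx_bits = ppm_demodulate(pulse_times)
print(pulse_times, rx_bits)
```

At 20 Mb/s each symbol spans four periods of the 80-MHz clock, so the two pulse positions are 12.5 ns apart and the oscillator stays off for most of the symbol.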

III. EXPERIMENTAL RESULTS
The 64-channel neural recording SoC has been fabricated in a standard 130-nm CMOS process. It occupies an area of 25 mm², including pads, and its overall power consumption is 965 µW from a 0.5-V supply. Fig. 5 shows the measured results related to the recording channel. The full chain has a digitally-controlled gain between 40 and 58 dB and an input-referred noise of 5.6 µV rms, in accordance with the system specs. The passband ranges from 1 Hz to 10.5 kHz, enabling the capture of both LFPs and neural spikes. Fig. 6 shows the typical measured linearity performance of the channel ADC. A stand-alone converter has been implemented on the die and fed by an external test signal to measure its dynamic and static linearity metrics. The ADC achieves an ENoB of 8.45 bit, a DNL < 0.61 LSB, an INL < 2 LSB and a 52.6-dB SNDR for an efficiency of 6 fJ/conversion-step, in line with the performance of state-of-the-art ADCs even though it is implemented in a less scaled technology. Each channel dissipates about 1 µW (0.93 µW and 70 nW for the amplifier and for the ADC, respectively), yielding a channel noise efficiency factor (NEF) of 3.11 and a power efficiency factor (PEF) of 4.84, the latter parameter being defined as PEF = NEF² · VDD. This work achieves the best PEF, improving on the 9.42 result in [5] by almost 2x. The total power consumption of the analog front-end, including clock and reference circuits, is 495 µW.
Fig. 3(b) shows a measured TX pulse. The TX spectrum achieves a -10-dB band of 1.1 GHz around 8 GHz and is compliant with the UWB spectral mask, which limits the power spectral density to -41.3 dBm/MHz. The TX power consumption at the nominal 20-Mb/s bit-rate is 470 µW, mostly due to the DCO (350 µW). This corresponds to an energy consumption of 23.5 pJ/bit that, together with the 2.76 pJ per pulse delivered to the antenna, results in an efficiency of 11.7%, which is the best reported so far among fully-integrated UWB transmitters. A wireless communication test was performed using a receiver (not part of this SoC) similar to that presented in [11], whose sensitivity is around 1 aJ per pulse, and a patch antenna made of a disc monopole with an L-shaped ground plane (featuring a size of 5 cm × 5 cm). A 7.5-m TX range at BER = 10⁻³ was achieved, as shown in Fig. 7. The UWB transmitter accounts for 49% of the overall system power consumption, while the 64 acquisition channels have a negligible impact (6%). The remaining power consumption is mainly due to the circuits needed to generate the synchronization signals. In particular, 18% of the power is due to the 80-MHz reference clock circuit, while the digital circuit delivering the reference clock to the converters and PISO registers accounts for about 22% of the power.
Although most of the circuit blocks were individually optimized and their performance verified, the main challenge for these systems is to retain consistent performance while the entire system is operating. Therefore, to verify functional and robust operation, a full-system benchtop test was performed. In this test, pre-recorded biopotential signals were applied through a signal generator to one input of the system, setting the amplifier gain to the maximum value (58 dB) and placing the receiver at a distance of 3 m. This distance has been chosen as a reasonable value for both a head-mounted and an implantable application. Fig. 8 shows one reconstructed waveform after demodulating the wirelessly transmitted data. The comparison between the original trace and the corresponding received waveform shows an excellent quality of the data acquisition and of the wireless link. In a second test, the inputs were short-circuited to ground and the output signals (i.e., noise) were acquired wirelessly to measure the input noise while the whole system was operating. The equivalent input noise is close to 9 µV rms and its increase is mainly due to digital and substrate noise. This value is comparable to other previously published results [4], [5] and it is low enough for neural signal acquisition.
In Table I the proposed SoC is compared to other wireless neural recording systems. The implemented device features a power per channel of 15 µW, preserving the quality of the transmitted signal (the raw signal is acquired and transmitted without data compression as in [2], [3] or bandwidth reduction as in [9]). Only the systems in [5] and [6] feature a lower power per channel, of 2.6 µW and 3.5 µW, respectively. However, the work in [5] accomplishes this outstanding result with only 4 channels and a limited 1-Mbps data rate, while the system in [6] has been designed for ECoG applications, sampling each channel at 1 kSps. Systems with different signal bandwidths and sampling rates can nevertheless be compared by computing the system energy per bit, i.e., dividing the overall power consumption by the bit-rate, as in Table I. The proposed SoC features an energy per bit of 48.2 pJ, outperforming the device in [6] (225 pJ/bit).
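The headline figures-of-merit reported in this section follow directly from the measured power and bit-rate values, as the short calculation below shows. The variable names are illustrative. Two assumptions are made: one transmitted pulse per bit (consistent with the quoted per-pulse and per-bit energies) and the commonly used definition PEF = NEF² × VDD, which reproduces the reported value.

```python
TOTAL_POWER_W = 965e-6       # whole SoC, 0.5-V supply
FRONT_END_POWER_W = 495e-6   # analog front-end incl. clock and references
TX_POWER_W = 470e-6          # UWB transmitter at the nominal bit-rate
BIT_RATE_BPS = 20e6
N_CHANNELS = 64
PULSE_ENERGY_J = 2.76e-12    # energy per pulse delivered to the antenna
NEF = 3.11
VDD_V = 0.5

power_per_channel_w = TOTAL_POWER_W / N_CHANNELS
energy_per_bit_j = TOTAL_POWER_W / BIT_RATE_BPS
tx_energy_per_bit_j = TX_POWER_W / BIT_RATE_BPS

# ASSUMPTION: one pulse is transmitted per bit.
tx_efficiency = PULSE_ENERGY_J / tx_energy_per_bit_j

# ASSUMPTION: PEF defined as NEF^2 * VDD.
pef = NEF ** 2 * VDD_V

tx_power_share = TX_POWER_W / TOTAL_POWER_W

print(f"power per channel: {power_per_channel_w * 1e6:.2f} uW")
print(f"system energy per bit: {energy_per_bit_j * 1e12:.2f} pJ")
print(f"TX energy per bit: {tx_energy_per_bit_j * 1e12:.2f} pJ")
print(f"TX efficiency: {tx_efficiency * 100:.1f} %")
print(f"PEF: {pef:.2f}")
print(f"TX share of total power: {tx_power_share * 100:.0f} %")
```

The same energy-per-bit calculation (total power divided by bit-rate) can be applied to any system in Table I, which is what makes it a fair metric across different channel counts and sampling rates.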
IV. CONCLUSIONS
This paper presents a 64-channel, 0.5-V supply neural recording system-on-chip with 20-Mbps wireless telemetry. The system is able to transmit the recorded neural data with a power per channel of 15 µW and an energy of 48.2 pJ per transmitted bit, which represent the lowest figures-of-merit among state-of-the-art wireless neural recording systems with more than 32 channels. These results have been achieved without compromising either the signal quality, since 10-kHz-band raw data are acquired and transmitted, or the TX range. These features make the presented SoC a viable option for an envisioned chronically implantable brain activity monitoring device as well as for a head-mounted system to be used in neuroscience experiments with laboratory animals.

Fig. 2. Detailed schematic of the recording channel with the 2-stage amplifier and the 10-bit binary-weighted with attenuation capacitor SAR ADC.

Fig. 3. Simplified schematic of the DCO with the transformer driving the antenna (a) and measured pulse waveform (b).

Fig. 8. (a) Neural trace transmitted by the wireless link and (b) comparison between an original and a reconstructed spike.
† Only one channel is sampled by a 10-bit ADC; for the remaining channels, only spike detection is performed.