Eliza Strickland: Paralysis used to be thought of as a permanent condition, but over the past two decades, engineers have begun to find workarounds. They’re building on a new understanding of the electrical code used by the nervous system. I’m Eliza Strickland, a guest host for IEEE Spectrum’s Fixing the Future podcast. Today I’m talking with Chad Bouton, who’s at the forefront of this electrifying field of research. Chad, welcome to the program, and can you please introduce yourself to our listeners?
Chad Bouton: Yes, thanks so much, Eliza, for having me. My name is Chad Bouton, and I’m at the Northwell Health Feinstein Institute for Medical Research.
Strickland: And can you tell me a bit about the patient population that you’re working with? I believe these are people who had become paralyzed, and maybe you can tell us how that happened and the extent of their paralysis.
Bouton: Absolutely. Absolutely. In fact, we work with folks that have been paralyzed either from a traumatic injury, stroke, or even a brain injury. And there’s over 100 million people worldwide that are living with paralysis. And so it’s a very devastating and important condition, and we are working to restore not only movement, but we’re making efforts to restore sensation as well, which is often not the focus and certainly should be.
Strickland: So these are people who typically don’t have much movement below the head, below the neck?
Bouton: So we have focused on tetraplegia or quadriplegia because, obviously, it’s extremely important and it is very difficult to achieve independence in our daily lives if you don’t have the use of your hands in addition to not being able to move around and walk. And it surprisingly accounts for about half of the cases of spinal cord injury, even slightly more than half. And it used to be thought of as something that was a more rare condition, but with car accidents and diving accidents, it’s a prominent and critical condition that we need to really address. And there’s no cure currently for paralysis. No easy solution. No simple fix at this point.
Strickland: And from your experiences working with these people, what kind of capabilities would they like to get back if possible?
Bouton: Well, individuals with paralysis would like to really regain independence. I’ve had patients and study participants comment on that and really ask for advances in technology that would give them that independence. I’ll speak to some of the things we’re doing in the lab, but folks often ask, “Could we take this home or take it outside the lab?” And we’re certainly working to do that as well. But the goal is to be more independent, ask for help less, be able to achieve functional abilities to do even things that we might consider just basic necessities, feeding, grooming, and even some of the personal aspects, being able to hold someone’s hand and to feel that person’s hand or a loved one’s hand. Those are the things that we’re really targeting and working hard to address.
Strickland: Yeah, I thought it’s really interesting that your group is focused on hands. There are other groups that are working on letting people walk again, but the hands feel like a very obviously important target too.
Bouton: Yeah, absolutely. And in fact, there’s been studies and widespread surveys on this topic, and folks that are living with tetraplegia or quadriplegia prioritize or say their top desire is to move their hands again. And if you step back and think about it for a second, it makes sense because we rely on our hands so much. And even losing one hand, say from a stroke, can be devastating and very disruptive to our lives.
Strickland: Yeah, let’s go over the basics of electrophysiology for listeners who don’t have a background in that area. I love this field. It has such a long history that goes back to the 1780s when Luigi Galvani touched an exposed nerve of a dead frog with a scalpel that had an electric charge and saw the frog’s leg kick.
Strickland: Can you explain how the nervous system uses electricity?
Bouton: Yes, absolutely. So it’s an electrochemical phenomenon. And of course, it involves neurotransmitters as well. When a neuron fires, as we say, that’s an electrical impulse. It only lasts a very brief moment, less than a thousandth of a second. But basically, there’s a polarization of the neuron itself and charges that are passing through ion channels. So what does this mean? Well, it’s kind of like in a computer where you have zeros and ones. For a brief moment, that cell has changed from, let’s say, a zero to a one, and it is firing or having this impulse that represents that binary one. And what’s so neat about it is that the firing rate, so basically how often those impulses are happening or how fast they’re happening, carries information. And then, of course, which neurons or nerve fibers carry the information, or which ones are firing, is what we call spatial encoding. So you have temporal encoding and spatial encoding. Those together can carry a tremendous amount of information or can mean different things, whether it’s a motor event where there’s a need to activate certain muscles in the hand or the fingers or the legs or any muscle throughout the body. And we also have sensory information that gets encoded by the same approach. And so information can pass from the brain to the body and from the body back to the brain, and we have these two-way information highways all throughout our central and peripheral nervous system. I often call it the most complex control system in nature, and we’re still trying to understand it.
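To make the temporal and spatial coding idea concrete, here is a toy sketch in Python. The neurons, spike times, and threshold are invented for illustration; this is not lab code, just the two codes Bouton describes: how fast a neuron fires (temporal) and which neurons fire (spatial).

```python
# Toy illustration of temporal and spatial neural coding.
# Each neuron's activity is a list of spike times (in seconds);
# the firing rate carries the temporal code, and WHICH neurons
# are active carries the spatial code.

def firing_rate(spike_times, window=1.0):
    """Spikes per second over an observation window (temporal code)."""
    return len(spike_times) / window

def active_channels(population, threshold_hz=10.0, window=1.0):
    """Spatial code: indices of neurons firing above a threshold rate."""
    return [i for i, spikes in enumerate(population)
            if firing_rate(spikes, window) > threshold_hz]

# Three hypothetical neurons recorded for 1 second
population = [
    [0.05 * k for k in range(20)],   # 20 Hz: strongly active
    [0.3, 0.7],                      # 2 Hz: near-silent
    [0.02 * k for k in range(40)],   # 40 Hz: very active
]

print(firing_rate(population[0]))   # 20.0
print(active_channels(population))  # [0, 2]
```

Reading both codes together, a decoder can distinguish not just that "something" is happening, but which movement or sensation is being signaled and how strongly.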
Strickland: Yeah, so for a person with tetraplegia, these electrical messages from the brain are essentially not getting through. The highway is blocked, right?
Bouton: That’s right. Absolutely. And so let’s walk through that scenario. Someone who’s had a car accident or a diving accident, often the highest level of stress occurs at the base of the neck, at a level we call C5, the fifth cervical vertebra. Often the cord gets damaged because the vertebra itself, which would normally protect the cord, unfortunately gets fractured and can then slip or slide and actually crush or damage the cord. What is often misunderstood is that you don’t get a simple, complete shutdown. You get damage, in varying amounts and at varying levels. Someone can become paralyzed and lose sensation as well, along with motor capability. It’s not going to be the same for everyone. But usually there’s damage, and signals are able to get through, though often very attenuated, very weak. And so I’ll talk through some of the approaches we’re taking now to boost, if you will, and enhance those signals. The good news is that we’re finding more and more that those signals are there and can be boosted or enhanced, which is very, very exciting because it’s opening doors to new therapies that we’re developing.
Strickland: Yeah, I love that you call your system the neural bypass, which is very evocative. You can imagine picking up the signals in the brain, getting around the blockage, and sending the information onto the muscles. So maybe we can talk about the first part of that first. How do you get the information from the brain?
Bouton: Well, yes, the neural bypass, so it’s funny because that phrase was used very briefly back in the ‘70s. And then it kind of went away and I think really because it wasn’t possible with technology at that time. But then in the early 2000s, we started to really explore this concept and use that phrase again and say, if we can put a microelectrode array in the brain, which we did back around 2005, 2006, and a number of colleagues and various team members kind of looked at that and said, yes, we can record from the brain. We can even stimulate the brain. But we said, why couldn’t we take that information, reroute it, as you say, around an injury or even a damaged part of the nervous system or the brain itself and create this neural bypass, and then reinsert the signals or link those signals directly to muscle stimulation? And that was what we called the one-way bypass, neural bypass. And why couldn’t we do that and restore movement? And so we attempted to do that and were thankfully successful in 2014. In fact, we had enrolled a young man named Ian Burkhart. His name, of course, became public, and he was the first paralyzed individual to regain movement using a brain implant that formed this neural bypass, this one-way or unidirectional neural bypass. And it was very, very exciting, and he was able to do some pretty amazing things with this approach. And in fact, I still remember when he first drank from a glass on his own. He reached out, opened his fingers using the bypass, which he hadn’t been able to do for four years since his accident, and he was able to open his hand by himself without help, pick up a glass, bring it to his lips, and be able to just take a drink. It was really quite a moment, and the entire team and myself were very moved and we thought we’re really taking an important step forward here.
Strickland: Ian Burkhart also played Guitar Hero if I remember right. Is that correct?
Bouton: Yeah, so another very, very exciting moment was when we explored the idea of rhythmic movements in the hand. So I’ll do a little experiment here. We’ll do it even though this is a podcast, but we can all do this experiment. If you hold up one hand-- and you should try this, Eliza. Okay, so hold up, say, either left or right. Now take your other hand and drum your fingers against the palm of your hand and go very, very fast. Okay, now stop, and now try to reverse directions. Okay. And is it awkward and harder? Okay, so now pay attention to which way was the fastest, what we would call, quote, “natural” way for you. Was it pinky to index or index to pinky?
Strickland: Pinky to index was really easy for me. The other way was almost impossible.
Bouton: Okay, well, you’re in what we call the normal group. About 85 percent of the population finds the pinky-to-index direction faster and more natural; only 15 percent goes from index to pinky. And the question is, why in the world is there a wiring, if you will, or a natural direction? So we looked at rhythmic movements. As we looked at the electrode array and the signals we were recording, we could see there was a group or an ensemble of neurons that were firing when thinking about rhythmic movements, say just wiggling a finger. And there’s a totally different group when you actually try to do a static movement of that finger, when you’re trying to press it and hold that finger in a certain position. So we thought, let’s see if we can decipher these different groups. We linked those signals back to neuromuscular stimulators that we had developed, and we then asked the question, could Ian or others move the fingers in a more dynamic way? We published another paper on this, but he was able to move his fingers both dynamically and statically, and he could then play Guitar Hero just by thinking about different static or sustained movements, holding a note, let’s say, on the guitar, or dynamically doing riffs. And we have videos online. But it was really amazing to deepen our understanding and also to allow, again, a little more independence, to let someone do something fun, a little more recreational too.
Strickland: Sure, sure. So Ian was using implanted electrodes to get his brain signals. Can you walk us through the different approaches, implants versus wearables?
Bouton: Yes, actually, there are a number of ways of tapping into the nervous system and specifically into the brain. A more recent approach we’ve been taking is to use a minimally invasive procedure to place a very thin electrode, called a stereoelectroencephalography electrode, or sEEG. These are used routinely at our location and a number of locations around the world for mapping the brain in epilepsy patients. But now we asked the question, could we use these electrodes to record and stimulate in the motor and sensory areas? And just this past year we did both, and our findings were quite striking. We were able to not only decode individual finger movements with this different type of electrode and approach, but we were also able to stimulate in primary sensory cortex, actually down in the central sulcus. That’s right between your motor and sensory areas. And on the wall of the sulcus on the sensory side, we were able to stimulate and elicit highly focal percepts at the fingertips. This has been a challenge with other electrodes, like the kind I was previously talking about, which were placed on the surface of the brain, not down into the sulcus. So this has allowed us to answer new questions and is also opening a door to a minimally invasive approach that could be extremely effective in trying to restore even finer movements of the human hand, and sensation as well. Consider that you can’t button your shirt without tactile feedback, and getting that feedback at the tips of the fingers is so important for fine motor tasks and dexterous hand movement, which is one of the goals of our lab and center.
Strickland: Yeah, I wanted to ask about this idea of the two-way bypass. So in this idea, you have sensors on your fingers or on your hand, and those are sending information to electrodes that are conveying it to the brain?
Bouton: That’s absolutely right. With the fingertips and the thin membrane sensors that we’ve developed, we can pick up not only the pressure level at the fingertips but also directional information. So in other words, when we pick up, say, a cup (I have one here on my desk, and I’m picking this cup up), there’s a downward, what we call shear, force that’s pushing the skin down towards the floor. This is additional information the brain receives, so that we know, oh, we’re picking something up that has some weight to it. And you don’t even realize you’re doing this, but there’s a relatively complex circuit involving interneurons in the spinal cord that tightens that grip naturally, just a little subtle increase in your grasp. And so when we want to create a bidirectional or two-way neural bypass, we have to take that information from the sensors, route it back into our computer, and decode or decipher it. That part is straightforward from the sensors. But then how do you encode that information so the brain will interpret it as, oh, I feel not only some kind of sensation at my fingertips, but also the level of that sensation?
And just last year, we were able to show that we can encode the different levels of pressure or force felt, and the participants have reported very accurately what those levels are. Once the computer interprets that and starts to send signals back to another set of what we call microstimulators that stimulate the brain, again with the right firing rate or frequency, the challenge still remains to make that feel natural. Right now, people still report a slightly artificial sensation sometimes; they say, I feel this pressure at different levels, but it’s a little bit electrical, or even mechanical, like a vibration. But it is still extremely useful, and we’re still refining it. Now what you’ve done is you’ve started to close the loop, right? Not only can signals from the brain be interpreted and sent to stimulation devices for muscle activation, we can also pick up the tactile sensation, send it back into the brain, and now we have a fully closed loop, a bidirectional bypass.
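The closed loop Bouton describes can be sketched very loosely in a few lines of Python. The pressure range, the linear pressure-to-frequency mapping, and all the names here are invented for illustration, not the lab’s actual encoding; the point is just the shape of the loop, with fingertip force encoded as a stimulation frequency (a firing rate), as discussed above.

```python
# Sketch of one cycle of a bidirectional "neural bypass".
# Motor side: a decoded intent drives muscle stimulation.
# Sensory side: fingertip pressure is encoded as a stimulation
# frequency, i.e., intensity encoded as firing rate.
# All values and the linear mapping are hypothetical.

def encode_pressure(pressure_n, max_pressure_n=10.0,
                    min_hz=20.0, max_hz=300.0):
    """Map fingertip pressure (newtons) to a stimulation frequency (Hz),
    clamping to the assumed sensor range."""
    level = max(0.0, min(pressure_n, max_pressure_n)) / max_pressure_n
    return min_hz + level * (max_hz - min_hz)

def bypass_step(decoded_intent, sensor_pressure_n):
    """One loop iteration: forward the motor command, return the
    sensory stimulation frequency to write back into the brain."""
    muscle_cmd = {"target": decoded_intent,
                  "stimulate": decoded_intent != "rest"}
    sensory_hz = encode_pressure(sensor_pressure_n)
    return muscle_cmd, sensory_hz

cmd, hz = bypass_step("grasp", 5.0)
print(cmd["stimulate"], hz)   # True 160.0
```

In a real system both directions run continuously and must adapt, which is exactly the modulation challenge Bouton returns to later in the interview.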
Strickland: So when you’re sending commands to muscles to have the hand do some movement, how much do we understand the neural code that makes one finger move versus another one?
Bouton: Yeah, that’s a great question. We actually understand a fair amount about that after many years and many groups looking at it. We now understand that we can change the firing rate, how fast we stimulate a muscle, to get a certain contraction level. Recording the signal from the motor cortex and understanding how it translates to a different level of contraction, we also understand much better now, even down to whether it should be a static movement or a dynamic movement, which I spoke a little bit about. What’s hard, and what we’re still trying to understand, is synergistic movements: when you want to activate multiple fingers together and do a pinch grasp, or you want to do something more intricate. There have been studies where people have tried to understand the signals when someone flips a quarter between the fingers, you’ve seen this trick, or spins a drumstick around, manipulating it and transferring it from one pair of fingers to another. Those super complex movements involve motor and sensory networks working together very, very closely. And so if you’re, say, eavesdropping only on the motor cortex, you really have just half the picture, half the story.
And so one of the things we’re going to be looking at, and we now have FDA clearance to do this, is to record in both the motor and sensory areas and then to stimulate in the sensory area of the brain. By recording in both, we can start to look more deeply into the question of how those networks communicate with each other and how we further decode or decipher that information. I have someone in my lab, Dr. Sadegh Ebrahimi, who did his graduate and postdoc work at Stanford, where he looked at how different areas of the brain communicate and pass massive amounts of information back and forth, how they are connected, and how that information flows. He is going to be looking at that question, along with whether we can use reinforcement learning techniques to further refine our decoding and, more importantly, our encoding: how we stimulate the brain and even the muscles, and how we get all of these networks working together.
Strickland: And for the electrodes that are controlling movement, are those a wearable system that people can just have on their arm?
Bouton: Yes, we’re very excited to announce that we’re now developing wearable versions of the neuromuscular stimulation technology, and our hope is to make this available outside the lab in the next year or two. What we have done is develop very thin, flexible electrode arrays that have been ruggedized and encapsulated in a silicone material. There are literally over 200 electrodes now in these patches, and they’re able to precisely stimulate different muscles. But what’s so fascinating is that by using the right electrical waveforms, which we have been optimizing for a number of years, and the right electrode array design, it turns out we can isolate individual finger movements very accurately. We can even get the pinky to move in very unique ways and the thumb in multiple directions. And because this approach is wireless, lightweight, and thin, people can actually wear it under their clothes and use it out and about, outside the lab, in their homes. And so we’re really looking forward to accelerating this.
And you can link this wearable technology either to a brain-computer interface, which is what we’ve been talking a lot about, or there’s even a stand-alone mode that uses inertial sensing of what we call body language, basically body movements. These are the residual movements that individuals are still able to make after their injury. It might be shoulder movement or lifting the arm. Often, in a C5-level injury, the biceps are thankfully spared, so one can lift the arm and shoulders. Folks can reach, but they can’t open and use the hand. With this technology, we infer what they want to do. If they’re reaching for a cup of water, we can infer, ah, they’re reaching with a certain trajectory, and we use our machine learning or AI algorithms so that, even before the hand gets to the target, we know they’re trying to do what we call a power grasp or a cylindrical grasp. And we start to stimulate the muscles to help them finish that movement that they can’t otherwise do on their own. This will not allow, say, playing Guitar Hero, but it is allowing folks to do very basic actions like picking up a cup or feeding themselves. We have a video of a participant picking up a granola bar and feeding himself for the first time. That was also really an incredible moment, because achieving that independence is what we’re trying to do at the end of the day.
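The stand-alone inference mode described here can be caricatured with a tiny nearest-centroid classifier. The feature values, class names, and scaling below are hypothetical stand-ins for the real inertial-sensor and machine learning pipeline; the sketch only shows the idea of classifying intent from reach features before the hand arrives at the target.

```python
# Toy version of inferring grasp intent from residual arm movement.
# Real systems use wearable inertial sensors and trained models; here a
# nearest-centroid classifier over made-up features stands in for that.

import math

# Hypothetical class centroids: (reach elevation in degrees, reach speed in m/s)
CENTROIDS = {
    "power_grasp": (30.0, 0.4),   # arm rising toward a cup
    "rest":        (0.0, 0.05),   # arm hanging, nearly still
}

def infer_grasp(elevation_deg, speed_ms):
    """Return the class whose centroid is nearest to the observation.
    Speed is scaled by 100 so both features contribute comparably."""
    def dist(label):
        e, s = CENTROIDS[label]
        return math.hypot(elevation_deg - e, (speed_ms - s) * 100)
    return min(CENTROIDS, key=dist)

print(infer_grasp(28.0, 0.35))   # power_grasp
print(infer_grasp(2.0, 0.02))    # rest
```

Once the classifier fires "power_grasp" mid-reach, the wearable would begin stimulating the hand muscles so the grasp completes as the arm arrives.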
Strickland: Yeah, let’s talk a little bit about commercialization. I imagine it’s a very different story when you’re talking about brain implants versus noninvasive devices. So where are you in that pathway?
Bouton: Yeah, so you’re absolutely right. There’s a big difference between those two pathways. I spent many years commercializing technologies. And when you take them out of the lab and try to get through what we call the valley of death, it’s a tough road. So what we decided to do is carve out the technology from the lab that was more mature and had a more direct regulatory path. We have been working closely with the FDA on this. We formed a company called Neuvotion, and Neuvotion is solely focused on taking the noninvasive versions of the technology and making those available to users who can really benefit from them. The brain-computer interface itself is going to take a little longer in terms of the regulatory pathway. Thankfully, as of last year, the FDA has issued a guidance document, which is always a first and very important step. And this is a moment in time where it is no longer a question of whether we will have brain-computer interfaces for patients, but just a question of when.
Strickland: Before we wrap up, I wanted to ask you about another very different approach to helping people with tetraplegia. So some researchers are using brain-computer interface technology to read out intentions from the brain, but then sending those messages to robotic limbs instead of the person’s own limbs. Can you talk about the tradeoffs, the challenges, and the advantages of each approach?
Bouton: Absolutely. So the idea of using a brain-computer interface to control a robotic arm was, and is, an important step forward in understanding the nervous system, movement, and even sensation. But the comment I’ve heard from a number of participants through the years is that at the end of the day, they would like to be able to move their own arm and feel, of course, with their own hands. And so we have really been focused on that problem. However, it does bring some additional challenges. Not only is a biological arm more complex and more difficult to control, with muscle fatigue and things like that to deal with, but there’s also another complication in the brain. When we reach out and pick up a cup, as I talked about earlier, the nervous system reacts to the weight of the cup and different things happen. Well, there’s another issue, too, when you stimulate in the sensory area and cause a percept. Someone says, “Okay, I feel kind of pressure on my fingertips.” Well, primary sensory cortex is right next door to primary motor cortex, S1 and M1 as they’re called. And so you have a huge number of interconnections between them.
And so we hypothesize, and we already have some evidence for this, that when you stimulate and start to encode information, writing into the brain, if you will, then on the read side, when you’re reading from the motor cortex, all those interconnections are going to cause changes in what we call modulation. You’re going to see changes in patterns. This is going to make the decoding algorithms more difficult to architect. We predicted this would happen when Ian became the first person to move his hand and to be able to pronate his arm. We predicted that during the transfer of objects, there might be difficulties and changes in the modulation that would affect the decoding algorithms. And indeed that did happen. So we believe that as we close the loop on this bidirectional neural bypass, we’re going to run into similar challenges and changes in modulation, and we’re going to have to adapt to that. So we’re also working on adaptive decoding. There’s been some great work in this area, but with actually reanimating or enabling movement and sensation in the human arm and hand itself, we believe we’re in for some additional challenges. But we’re up for it, and we are very excited to move into that space this year.
Strickland: Well, Chad, thank you so much for joining us on the Fixing the Future podcast. I really appreciate your time today.
Bouton: Absolutely. Glad to do it, and thanks so much for talking with me.
Strickland: Today on Fixing the Future, we were talking with Chad Bouton about a neural bypass to help people with paralysis move again. I’m Eliza Strickland for IEEE Spectrum, and I hope you’ll join us next time.
Eliza Strickland: Technology to combat climate change got a big boost this year when the US Congress passed the Inflation Reduction Act, which authorized more than $390 billion for spending on clean energy and climate change. One of the big winners was a technology called carbon capture and storage. I’m Eliza Strickland, a guest host for IEEE Spectrum’s Fixing the Future podcast. Today, I’m speaking with Philip Witte of Microsoft Research, who’s going to tell us how artificial intelligence and machine learning are helping out this technology. Philip, thanks so much for joining us on the program.
Philip Witte: Hi, Eliza, I’m glad to be here.
Strickland: Can you just briefly tell us what you do at Microsoft Research, tell us a little bit about your position there?
Witte: Sure. So I’m a researcher at Microsoft Research, and I’m working on scientific machine learning in a broader sense, and on high-performance computing in the cloud. Specifically, how do you apply recent advances in machine learning and HPC to carbon capture? I’m part of a group at Microsoft called Research for Industry. We’re part of Microsoft Research overall, but we specifically focus on transferring technology and computer science to solving industry problems.
Strickland: And how did you start working in this area? Why did you think there might be real benefits of applying artificial intelligence to this tricky technology?
Witte: So I’ve actually been interested in this topic for a couple of years now, and really started diving deeper into it maybe a year and a half ago, when Microsoft signed a memorandum of understanding with one of the big CCS projects, called Northern Lights, to explore how Microsoft can support the project as a technology partner.
Strickland: So we’ll get into some of these super tech details in a little bit. But before we get to those, let’s do a little basic tutorial on the climate science here. How and where can carbon dioxide be meaningfully captured, and how can it be stored, and where?
Witte: So I think it’s worth pointing out that there are two main technologies around carbon capture. One is called direct air capture, where you capture CO2 directly from ambient air. The second is what’s usually referred to as CCS, carbon capture and storage, which is carbon capture in an industrial setting, where you extract or capture CO2 from industrial flue gases. And the big difference is that in direct air capture, where you’re capturing CO2 directly from the air, the CO2 content of the ambient air is very low, about 0.04 percent overall. So the big challenge of direct air capture is that you have to process a lot of air to capture a given amount of CO2. But you are actively reducing the overall amount of CO2 in the air, which is why it’s also referred to as a negative emission technology. On the other hand, with CCS, where you’re extracting CO2 from industrial flue gases, the advantage is that the CO2 content is much higher, about 3 to 20 percent. So by processing the same amount of gas using CCS, you can capture much more CO2, or more accurately, prevent much more CO2 from entering the atmosphere in the first place. So that’s the basic distinction between direct air capture and CCS.
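The concentration numbers quoted here imply a large difference in how much gas must be processed per tonne captured. A back-of-the-envelope sketch, treating the quoted percentages as mass fractions and ignoring capture efficiency for simplicity:

```python
# Back-of-the-envelope: gas volume to process scales inversely with
# CO2 concentration. Ambient air is ~0.04% CO2; flue gas ~3-20%.
# Percentages treated as mass fractions; capture efficiency ignored.

def gas_needed(co2_target_t, co2_fraction):
    """Tonnes of gas to process to contain co2_target_t tonnes of CO2."""
    return co2_target_t / co2_fraction

target = 1.0                         # capture 1 tonne of CO2
air  = gas_needed(target, 0.0004)    # direct air capture, 0.04%
flue = gas_needed(target, 0.10)      # mid-range flue gas, 10%

print(air)          # ~2500 tonnes of air
print(flue)         # ~10 tonnes of flue gas
print(air / flue)   # ~250x more gas for direct air capture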
And then for the actual capture part of CCS, there are a number of different technologies to do that. They are typically grouped into pre-combustion, post-combustion, and oxy-combustion. But the most popular one, the one mostly used in practice right now, is a post-combustion process called the amine process. Essentially, you have exhaust from factories with very high CO2 content, and you bring it into contact with a liquid containing an amine chemical that binds the CO2, so you basically pull the CO2 out of the gas. Now you have this amine liquid with a high CO2 concentration. And because you want to be able to reuse the chemical that binds the CO2, there has to be a second step in which you separate the CO2 from the amine. This is actually where you have to spend most of your energy, because you have to reheat the mixture, to about 250 to 300 degrees Fahrenheit, to separate out a very high-content CO2 stream that you can then store, and reuse the amine. And once you have extracted the CO2, you have to compress it so that you can store it in the next step.
And then in between the capture and the storage, you have, of course, the transportation, because usually you have to move the CO2 from wherever you captured it to where you can store it. The most common ways to transport CO2 are in pipelines or in vessels. And then in the final step, when we actually want to store the CO2, there are different possibilities for storage that have been explored in the past. People have even looked at storing CO2 at the bottom of the ocean, but we’ve moved away from that idea now; I don’t think anybody’s really considering it anymore. People have also looked at storing CO2 in old mineshafts. The approaches that are most seriously looked at now, or already used in practice, are storing CO2 in depleted oil and gas reservoirs or in deep saltwater aquifers that are a couple of kilometers below the surface. The important factors when you look at where to store CO2 are that, first of all, you have to have a large enough volume for it to be impactful, so that you can store enough CO2 there. Obviously, it has to be safe: once you store the CO2 there, you want to make sure that it actually stays where you injected it. And just as important is the cost factor; if you cannot store it cost-effectively, then it’s just not going to be used in practice. So like I said, depleted oil and gas reservoirs and deep saline aquifers are right now the storage sites that pretty much satisfy these three requirements.
Strickland: And as I understand it, carbon capture and storage is looked on as a useful technology for this transition because it can ease society's move away from fossil fuels. Power plants that run on gas and coal, and factories that use fossil fuels, can keep going for a little while, but if we can capture their emissions, then they're not adding to our climate change problem. Is that how you think about it?
Witte: I think so. There are a few areas, like the power grid, for example, where we have a good understanding of how we can actually decarbonize. A lot of it now is still using coal and natural gas, but we have a path toward carbon-neutral energy using nuclear power plants and renewables, of course. But then there are other areas where the answer is maybe not that obvious. For example, you release a lot of CO2 in steel production, or petrochemical production, or cement and construction. So in all these areas where we don't really have a very good alternative at the moment, you could become carbon neutral or carbon negative by using CCS technology. And then I guess another reason CCS is considered one of the main options is just that it's very mature in terms of technology. The underlying technology behind carbon capture actually dates back to the 1930s, when the process I just described for capturing CO2 was developed, and it has been used extensively as part of other industrial processes since the 1970s. That's why we have this whole network of pipelines that you could use to transport CO2. So in terms of technology, we have a really good understanding of how CCS works, and that's why a lot of people are looking at it as one possible technology. But of course, it's not going to solve all the problems. There's no silver bullet, really. Eventually, it just has to be part of a whole bigger package for climate change mitigation.
Strickland: And it's going to have to be part of the package at pretty enormous scale, right? What volume of carbon could we potentially be storing below ground in decades to come?
Witte: I have some numbers that I got from a talk by Philip Ringrose, who is one of the leading CCS experts. Roughly, we are releasing about 40 gigatons of CO2 into the atmosphere every year worldwide. One of the first commercial CCS projects currently being deployed is the Northern Lights project, which is looking at storing about 1.5 megatons per year initially, and then 3.5 megatons at a later stage. So if you take these numbers and look at the overall global release of CO2, you would have to have roughly 10,000-ish Northern Lights projects, or 10,000 to 20,000 CO2 injection wells. If you hear that, you might think, "Wow, that's really a lot. Ten to twenty thousand projects; how would we ever be able to do that?" But I think you really need to put that into perspective. Just look, for example, at how many wells we have for oil and gas production in the US alone: in 2014, there were roughly 1 million active oil and gas wells, and in that year alone, an additional 33,000 new wells were drilled. From that perspective, 10,000 to 20,000 wells just for CCS doesn't sound that bad; it's actually quite doable. But you're not going to be able to capture all the CO2 emissions with CCS alone. It's just going to be part of it.
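Witte's scaling argument can be written out as a two-line calculation, using the figures he quotes; treat the result as order-of-magnitude only.

```python
# The rough scaling argument above, written out.

GLOBAL_EMISSIONS_T = 40e9   # ~40 gigatons of CO2 released per year, worldwide
PER_PROJECT_T = 3.5e6       # Northern Lights later stage, ~3.5 Mt per year

projects_needed = GLOBAL_EMISSIONS_T / PER_PROJECT_T
print(round(projects_needed))  # prints 11429, i.e. the "10,000-ish" projects
```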
Strickland: So how can artificial intelligence systems be helpful in this mammoth undertaking? Are you working on simulating how the carbon dioxide flows beneath the surface or trying to find the best spots to put it?
Witte: Overall, you can apply AI to all three main components of CCS: the capture part, the transport part, and the storage part. I'm focusing mainly on storage and monitoring. For that, there are essentially three main questions you have to answer before you can do anything: Where can I store the CO2? How much CO2 can I store, and how much can I inject at a time? And is it safe, and can I do it cost-efficiently? To answer these questions, you have to run so-called reservoir simulations, where a numerical simulator predicts how the CO2 behaves during injection and after injection. The challenge with these reservoir simulations is, first of all, that they're computationally very expensive; these are big simulations that run on high-performance computing clusters for many hours, or even days. The second really big challenge is that you have to have a model of what the earth looks like so that you can simulate it. Specifically for reservoir simulation, you have to know what the permeability is, what the porosity is, and what the different geological layers look like. And obviously, you can't directly look into the subsurface. The only direct information we have comes from drilling wells, and in CCS projects you usually don't have very many wells; it might be only one or two.
The second source of information is basically remote sensing, something like seismic imaging, where you get an image of the subsurface, but it's not super accurate. Using this very sparse data from wells and seismic surveys, plus some additional sources, you build up a model of what the subsurface might look like, and then you can run your simulation. The simulation is very accurate in the sense that if you give it a model, it will give you a very accurate answer for that model. But like I said, the problem is that the model itself is very inaccurate. So over time, you have to adjust the model and tweak the different inputs so that it actually explains what's really happening in practice. One of the big challenges is that you want to be able to run a lot of these simulations, always changing the input a little bit, to see if you get the answer that you would expect.
So where we see AI helping out is, on the one hand, providing a way to simulate much faster than with conventional methods. Like I said, the conventional methods are very generic, but oftentimes I already have an idea of what the subsurface looks like and only want to tweak it a little bit here and there, which is where we think AI might be helpful. You have a lot of data just from running the simulations, and now you can use that simulated data to train a surrogate model for the simulator. You might be able to evaluate that surrogate model much, much faster, and then use it in downstream applications like optimization or uncertainty quantification to eventually answer the three questions I mentioned initially.
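The surrogate workflow Witte describes can be caricatured in a few lines: a toy function stands in for the expensive reservoir simulator, we generate input-output training pairs offline, and a cheap fitted model then answers queries much faster. The "simulator" and the polynomial surrogate below are invented purely for illustration; in practice the surrogate would be a neural network trained on thousands of full simulations.

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_simulator(perm):
    """Stand-in for a reservoir simulator: maps a permeability-like input
    to an output quantity (say, plume extent). Purely synthetic."""
    return np.sin(3 * perm) + 0.5 * perm

# 1) Generate training pairs by running the "simulator" offline.
x_train = rng.uniform(0, 2, size=200)
y_train = expensive_simulator(x_train)

# 2) Fit a cheap surrogate; here just a polynomial least-squares fit.
coeffs = np.polyfit(x_train, y_train, deg=7)
surrogate = np.poly1d(coeffs)

# 3) Evaluate the surrogate on new inputs (fast) and check it against
#    the "truth" on a test grid.
x_test = np.linspace(0, 2, 50)
err = np.max(np.abs(surrogate(x_test) - expensive_simulator(x_test)))
print(f"max surrogate error: {err:.4f}")
```

The point is the division of labor: the expensive model runs once to produce data, and the cheap model is the one called thousands of times inside optimization or uncertainty-quantification loops.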
Strickland: So you’re talking about using simulated data to train the model. How then do you check it against reality if you’re starting with simulated data?
Witte: With the simulated data, you would still have to go through the same process of matching it to the data that you measure out in the field. For example, in a CCS project, the CO2 injection wells have all kinds of instruments at the bottom that measure things like pressure and temperature, and then you have seismic surveys that you run during and after injection, from which you can get an image of where the CO2 is after you inject it. So you have a rough idea of where the CO2 plume is, and now you can run your simulations and, again, change the inputs so that the plume you simulate actually matches the one you observe in the seismic data, or matches the information from your well logs. That's something that's often done by hand, which is very time-consuming. The hope for machine learning is that you can not only make it faster but maybe also automate some of these steps.
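A toy version of that history-matching loop: assume a single unknown input (a permeability-like parameter) and adjust it until the simulated observation matches the measured one. The forward model and all the numbers here are made up; in practice the loop wraps a full reservoir simulator and many coupled parameters, which is exactly why it is slow to do by hand.

```python
import numpy as np

def simulate_plume_radius(permeability):
    """Toy forward model: plume radius as a function of permeability.
    Invented for illustration; a real reservoir simulator sits here."""
    return 120.0 * np.sqrt(permeability)

observed_radius = 600.0  # e.g. plume extent picked from a seismic image

# History matching by bisection on the single unknown parameter: the
# model is monotonic in permeability, so bisection is enough here.
lo, hi = 1.0, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if simulate_plume_radius(mid) < observed_radius:
        lo = mid
    else:
        hi = mid

matched = 0.5 * (lo + hi)
print(f"matched permeability: {matched:.2f}")  # prints "matched permeability: 25.00"
```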
Strickland: You’re using a type of neural network called Fourier Neural Operators in this work, which seem to be particularly useful in physics for modeling things like fluid flows. Can you tell us a little bit about what Fourier Neural Operators are, what kind of inputs they use, and what the benefit of using them is?
Witte: Fourier Neural Operators are a kind of neural network designed for solving partial differential equations. The original work was done at Caltech by Anima Anandkumar, her PhD student Zongyi Li, and, I think, Andrew Stuart was also involved. The idea is that you simulate training data using a numerical simulator with a bunch of different inputs. One input could be, for example, the earth model: what does the earth look like? And the simulator output would be how the CO2 behaves over time. You generate many different input-output pairs, and then typically you train in a supervised fashion, where you now have thousands of training pairs, and you train, for example, a Fourier Neural Operator to simulate the CO2 for a given input. Then you can use that in these downstream applications that require a lot of simulations.
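The heart of a Fourier Neural Operator is the spectral convolution layer. The sketch below, in plain NumPy with random, untrained weights, shows just that one building block: transform the input to frequency space, apply learned complex weights to the lowest modes, discard the rest, and transform back. A real FNO, following Li et al.'s construction, stacks several such layers with pointwise linear transforms and nonlinearities and trains the weights on the simulated pairs Witte describes; everything here is a simplified illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_layer(x, weights, n_modes):
    """One spectral convolution: FFT the input, multiply the lowest
    n_modes frequencies by complex weights, zero the higher modes,
    then inverse FFT back to physical space."""
    x_hat = np.fft.rfft(x)
    out_hat = np.zeros_like(x_hat)
    out_hat[:n_modes] = x_hat[:n_modes] * weights
    return np.fft.irfft(out_hat, n=len(x))

# A 1-D "earth model" input on a 64-point grid (a random stand-in).
x = rng.standard_normal(64)

# Untrained complex weights for the 12 lowest Fourier modes. In a real
# FNO these are the trainable parameters, learned from simulation data.
w = rng.standard_normal(12) + 1j * rng.standard_normal(12)

y = fourier_layer(x, w, n_modes=12)
print(y.shape)  # prints (64,): same grid, filtered and reweighted
```

Because the weights act in frequency space, the layer is resolution-independent in principle, which is one reason this architecture suits PDE surrogates like the CO2-flow problem.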
Strickland: Okay. So to bring this back to the physical world, what happens if carbon dioxide that’s injected into a subsurface aquifer or something like that doesn’t stay put? Is there a safety problem? Could it potentially cause earth tremors, or is it just that it would negate the effect of putting CO2 underground?
Witte: There's definitely a risk. It's not risk-free, but I initially overestimated the risks, because the mental picture I had was of a big, empty space in the subsurface: you inject CO2 as a gas, and then you only need the tiniest leak somewhere and all the CO2 is going to come back out. But when you actually inject the CO2, it's not a gas anymore, because it's under very high pressure and very high temperature, so it's more like a liquid. It's not an actual liquid; it's in what's called a supercritical state, but essentially it behaves like a liquid. Philip Ringrose said, "Think of it as olive oil." The second aspect is that the subsurface where you store it is not an empty space. It's more like a sponge, a very porous medium that absorbs the CO2. So overall, you have these different chemical and mechanical mechanisms that trap the CO2, and they're all additive. One mechanism is what's called structural trapping: if you inject CO2 into these saltwater aquifers, for example, the CO2 rises because it has a lower density than the salt water, so you need a good geological seal that traps it. You can think of it as an inverted bowl in the subsurface: the CO2 goes up, but it's trapped by the seal. That's called structural trapping, and it's very important, especially during the early project phases. But yes, you have these different trapping mechanisms that are additive, which means that even if you did have a leak, the CO2 would not all come out at once; it would escape very, very slowly. And CCS projects have instruments that measure the CO2 content, for example, so you could detect that very quickly.
Strickland: And can you talk a bit more about the Northern Lights project and tell us about its current status and what you’re working on next to help that project move forward?
Witte: Yeah, so Northern Lights describes itself as the world's first open-source CO2 transport and storage project. It doesn't mean open-source in the software sense. What it means in this case is that they essentially offer carbon capture and storage as a service: if you're a client, say a steel factory, and you install CCS technology to capture the carbon, you can now sell it to Northern Lights, and they will send a vessel, pick up the CO2, and store it permanently using geological storage. So the idea is that Northern Lights builds the transportation and storage infrastructure and then sells that as a service to companies. I think the first client they signed a contract with is a Dutch petrochemical producer called Yara Sluiskil.
Strickland: And to be sure I understand, you said that the companies that are generating the CO2 are selling the CO2 to the Northern Lights project, or is it the other way around?
Witte: I think about it more as: they pay for the service, and Northern Lights picks up the CO2 and then stores it for them.
Strickland: And one last question. If I remember right, Microsoft was really emphasizing open-source for this research. And what exactly is open-source here?
Witte: So we're planning to make the training datasets that we create open-source, as well as the code to generate the datasets and the code to train the models. I'm actually currently working on open-sourcing that, and I think by the time this interview comes out, hopefully it will already be available; you should be able to find it on the Microsoft Research industry website. But yeah, we really want to emphasize the openness not just of CCS itself but of the technology and the monitoring part, because I think in order for the public to accept CCS and have confidence that it works and that it's safe, you have to have accountability, and you have to be able to put that data out there, for example the monitoring data, as well as the software. Traditionally, in oil and gas exploration, the companies keep the data, and also the codes to run simulations and do monitoring, very close to the chest. There's not a whole lot of open-source data or code. Luckily, with CCS, we already see that changing. Companies like Northern Lights are actually putting their data on the web as open-source material for people to use. But of course, the data is only part of the story. You also need to be able to do something with that data: process it in the cloud using HPC and AI. So we're working really hard on making some of these components accessible, and that includes not only the AI models but also, for example, APIs to process data in the cloud using HPC. Eventually, we're really hoping that once we have all the data and the code available, it will help the overall community accelerate innovation and build on top of these tools and datasets.
Strickland: And that’s a really good place to end. Philip, thank you so much for joining us today on Fixing the Future. I really appreciate it.
Witte: Yeah, thanks, Eliza. I really enjoyed the conversation.
Strickland: Today on Fixing the Future, we were talking with Philip Witte about using AI to help with carbon capture and storage. I'm Eliza Strickland for IEEE Spectrum, and I hope you'll join us next time.
New technologies often are introduced through spectacle: Think of the historic demonstrations carried out by Faraday, Edison, Morse, and Bell, or, more recently, by Steve Jobs onstage in his black turtleneck at Macworld 2007, holding the first iPhone. Indeed, hyped-up product announcements at industry events like the Consumer Electronics Show (now CES) and the Game Developers Conference have become regular features of the digital world.
There’s also a parallel tradition—less flashy but no less important—of industry events that focus attention on digital infrastructure. Several of these events, such as the first public demo of the ARPANET in 1972, or the mid-1980s conferences now known as Interop, alerted experts to new technologies, and, in some cases, altered the balance between competing approaches.
Although many of these gatherings have escaped the attention of historians, our view is that these events should be recognized more fully as moments where experts could glimpse possible futures and judge for themselves what was most likely to happen. Here we describe a few of these do-or-die moments. You may not have heard of any of these events—but if you were there, you will never forget them.
The ARPANET was one of the first networks to apply packet switching, an approach to communications that breaks messages into discrete chunks, or packets, of data. It was a major departure from circuit-switched networks, such as telephone networks, for which communication partners were linked through a dedicated circuit.
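The core idea can be shown in a few lines: chop a message into numbered chunks, send them independently, and reassemble them at the destination regardless of arrival order. This is a sketch of the concept only, not of any actual ARPANET protocol.

```python
# Packet switching in miniature: number the chunks, tolerate out-of-order
# arrival, reassemble by sequence number.

def packetize(message: bytes, size: int):
    """Split a message into (sequence_number, chunk) pairs."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Packets may arrive in any order; sort by sequence number and join."""
    return b"".join(chunk for _, chunk in sorted(packets))

pkts = packetize(b"HELLO ARPANET", size=4)
pkts.reverse()  # simulate out-of-order arrival
print(reassemble(pkts))  # prints b'HELLO ARPANET'
```

Contrast this with a circuit-switched call, where the two endpoints hold a dedicated line for the whole conversation; here, each numbered chunk can take any available path.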
The first node of the ARPANET was installed at the University of California, Los Angeles, in 1969. But the ARPANET didn’t take off immediately. And by mid-1971, program director Lawrence Roberts of the Advanced Research Projects Agency (ARPA) was becoming impatient with the slow pace at which ARPA-funded researchers were getting connected. One of these researchers, Bob Kahn, suggested that Roberts organize a public demonstration of the ARPANET, both to educate other researchers about the network’s capabilities and to encourage new partners to support the initiative. Once Kahn found a venue for the demo—at the International Conference on Computer Communications (ICCC), to be held in Washington, D.C., in late October of 1972—he worked feverishly to get it organized.
Kahn recruited about 50 people to act as facilitators, including the ARPA-funded researchers Vint Cerf, Robert Metcalfe, and Jon Postel, all of whom were destined for networking fame. Kahn’s plan called for a TIP—short for Terminal Interface Processor—to be installed at the Hilton Hotel, the site of the ICCC. From there, attendees could log on to one of the ARPANET hosts and run an application remotely.
As these hand drawings from the time show, in December 1969 the ARPANET had just four nodes [top left]. That number grew to 15 by April 1971 [top right]. In an effort to speed the expansion further, network advocates organized a demonstration at the International Conference on Computer Communications in Washington, D.C., in October 1972. That meeting helped to grow the ARPANET, which by May of 1973 included some three dozen nodes [bottom]. Computer History Museum
For this to work smoothly, Kahn arranged for various applications (called “scenarios”) to be created and tested. He also had to convince manufacturers to loan, install, and configure terminals. And he had to work with the hotel to prepare the room for the demo and arrange with AT&T to run leased lines to the Hilton’s ballroom.
The ICCC would prove to be for packet switching what the 1876 Centennial Exposition in Philadelphia was for the telephone: the public unveiling of what would eventually lead to a technological discontinuity.
For the hundreds of computer-communications professionals, government employees, and academic researchers attending the ICCC, the demo permanently changed their perceptions of a computer as a single machine locked in an air-conditioned room. The TIP was on a raised floor in the middle of the ballroom, with dozens of connected computer terminals circled around it and dozens of ARPA scientists milling about, eager to show off their pride and joy.
To sit at a terminal and with a few keystrokes be connected through the TIP, to the ARPANET, and then to applications running on computers at dozens of universities and research facilities must have felt like a visit to an alien world. And for the ARPA scientists involved, the bonds formed from staging the demonstration left them heady and optimistic about the future they were creating.
Researchers in government, academia, and industry struggled over the next several years to realize the potential of what they had seen. How could they scale up and simplify the capabilities that Kahn and company spent a year bringing to the Hilton ballroom? One major problem was the cost and fragility of stringing a dedicated cable from every computer to every terminal. Several parties converged on a similar solution: a local area network, where one “local” cable could traverse an entire facility with all terminals and computers connected to it.
Users in large organizations—including the U.S. Air Force, which had decades of experience and investments in computer communications—had the most to gain from solutions to these problems. To promote cooperation, Robert Rosenthal at the U.S. National Bureau of Standards and Norman Meisner at Mitre, a federally funded R&D organization, arranged a series of workshops in early 1979 to explore “Local Area Network Protocols.” Their goal was to provide a mechanism for sharing and obtaining results from the latest research—especially knowledge that was not available in the published literature. When Rosenthal and Meisner contacted potential participants, it became clear that while virtually everyone working on local area networking sensed its importance, they all expressed confusion over what to do about it.
When it came to sorting out the solution, a meeting Rosenthal and Meisner organized in May 1979 proved to have enduring significance. The Local Area Communications Network Symposium, held at the Copley Plaza Hotel in Boston, featured five formal sessions, panel discussions, and twelve workshops. Rosenthal was astonished when about 400 people showed up. For most, it was a formative event, comparable in importance to the ARPANET demonstration in 1972. “There was electricity in the air,” Rosenthal recalled in a 1988 interview with one of us (Pelkey). “You had leaders [like] Bob Metcalfe saying: ‘The world’s going to be a better place.’ ”
Bruce Hunt of Zilog remembers “being amazed at how many people were really interested in local area networks,” and feeling satisfied that the instinct of the researchers involved—that they were onto something really important—was validated. And it wasn’t just hype by academics: Within a couple of months, three new companies were formed—Sytek, 3Com, and Ungermann-Bass. Emboldened by the clear demand for commercial networking equipment, these startups raised millions from investors and immediately began selling products for local area networking.
More and more professionals came to realize that networking technology would generate important benefits. But the engineers involved had not settled many technical details about how these networks would work. And a growing number of alternatives soon would be considered for standardization by the IEEE, including a now well-known technology called Ethernet, which will celebrate the 50th anniversary of its standardization in May of this year.
In the meantime, work was underway on a broader approach to the challenge of creating standards for computer communication, one that could serve to link up different computer networks, a concept that began at this time to be called “internetworking.” In 1978, a few dozen experts from around the world held the first meeting for an ambitious project to create a comprehensive suite of standards and protocols for disparate networking technologies. This effort, known as Open Systems Interconnection (OSI), was hosted first by the International Organization for Standardization (ISO) and later, jointly, by ISO and the International Telecommunication Union. OSI’s founding premise was that a layered architecture would provide a way to pull together the standards, applications, and services that diverse groups around the world were developing.
The lower layers of OSI concerned the formatting, encoding, and transmission of data over networks. The upper layers included advanced capabilities and applications, such as electronic mail and directory services.
Several initiatives examined proposals for standards and applications within OSI’s seven-layer framework. One arose at General Motors, which had a strategic goal of using computer-based automation to combat growing competition from abroad. In 1981, GM held exploratory conversations with Digital Equipment Corp., Hewlett-Packard, and IBM. These discussions culminated in the release of GM’s Manufacturing Automation Protocol (MAP) version 1.0 in 1982.
Boeing, with similar goals, announced that it would work with the National Bureau of Standards to lead the creation of an OSI protocol stack for technical and office environments, later to be named Technical and Office Protocols (TOP).
Once again, potential users and customers sought out live demonstrations so that they could judge for themselves what was hype and what was reality. One highly anticipated demo took place at Autofact ’85, a conference whose name reflects the era’s deep preoccupation with factory automation.
Autofact ’85 drew about 30,000 people to Detroit, with some 200 vendors exhibiting MAP-compatible and other kinds of automation products. In addition to data-processing equipment such as computers and terminals, a variety of factory-automation systems, including robots, vision systems, and engineering workstations, were on display. With them, attendees explored a custom-designed version of the Towers of Hanoi game, and an application for interactive file transfer, access, and management.
Although Autofact ’85 was well attended and generally hailed in the trade press as a success, some were put off by its focus on things to come. As one press account put it, “On the show floor, there are plenty of demonstrations but few available products.” The lingering questions around actual commercial applications gave promoters reason to continue organizing public demonstrations, such as the Enterprise Networking Event (ENE) in Baltimore, in June 1988.
Autofact ’85, a conference whose name reflects people’s keen interest in factory automation during that time, brought about 30,000 people and 200 vendors to Detroit in November of 1985. SUNY Polytechnic Institute
The hope for ENE was to provide demos as well as showcase products that were actually available for purchase. All the U.S. computing giants—including IBM, HP, AT&T, Xerox, Data General, Wang Laboratories, and Honeywell—would be there, as well as leading European manufacturers and some smaller and younger companies with OSI-compatible products, such as Apple, Micom, Retix, Sun Microsystems, 3Com, and Touch Communications. Keynote speakers from the upper levels of the U.S. Department of Defense, Arthur Andersen, and the Commission of European Communities reinforced the message that all major stakeholders were behind the global adoption of OSI.
ENE confirmed both the hopes of OSI’s supporters and the fears of its critics. Vendors were able to demonstrate OSI standards for network management and electronic mail, but instead of products for sale, the 10,000 or so attendees saw mostly demonstrations of prototypes—a marginal improvement on Autofact ’85.
There was a painful reality to the computer networks of the mid-1980s: On the one hand, they held a vast potential to improve business practices and enhance productivity; on the other, actual products that could integrate the diversity of installed equipment and networks—and thus provide a robust means of internetworking—were very limited. The slow progress of MAP and TOP products left an opening for alternative approaches.
And the most promising of those approaches was to rely on the core protocols then in use for the ARPANET: Transmission Control Protocol and Internet Protocol (known to insiders as TCP/IP). A broad market for suitable equipment hadn’t yet developed, but the community of experts that had grown around the ARPANET was increasingly active in promoting the commercial adoption of such products.
One of the chief promoters was Dan Lynch, a consultant who was instrumental in managing the ARPANET’s transition to TCP in 1983. Lynch led the planning of a workshop in Monterey, Calif., in August 1986, where equipment vendors could learn about TCP/IP. Lynch wanted to get the apostles of TCP “to come out of their ivory towers” and provide some guidance for vendors implementing their protocols. And they did, as Lynch recalled in a 1988 interview, where he called the workshop “outrageously successful.”
This meeting, the first TCP/IP Vendors Workshop, featured a mix of leaders from the TCP/IP research community and representatives from 65 vendors, such as Ungermann-Bass and Excelan. Lynch continued this trade-show-like approach with the TCP/IP Interoperability Conference in Monterey in March 1987 and the 2nd TCP/IP Interoperability Conference in Arlington, Va., in December of the same year.
Lynch’s strategy for TCP/IP seemed to be gaining momentum, as evidenced by an article in Data Communications in November 1987, which neatly summarized the state of affairs: “By the end of 1986, there were more than 100 vendor offerings of TCP/IP and its associated DARPA protocols. Moreover, major vendors, including IBM and Digital Equipment Corp. have recently begun to offer TCP/IP as part of their product lines…. While the long-term strategic direction taken by most companies is in the implementation of the OSI model and its protocols, TCP/IP appears to be solving the short-term problems of connections between networks.”
The market-research firm Infonetics published a report in May 1988 that documented a “dramatic increase in the commercialization of TCP/IP” and noted that increasing numbers of users were seeking solutions to integrate diverse computer equipment and networks. “Every sector of the market is planning to purchase TCP/IP products in the next year,” the report stated. “There is no indication that OSI is affecting purchase intent.”
At the time, Lynch was planning a new venue to promote the adoption of the protocols used for the ARPANET: the TCP/IP Interoperability Exhibition and Solutions Showcase, to be held in Santa Clara, Calif., in September 1988. And he gave the event a slick new title: Interop.
One of the key industry conferences that helped shape the Internet was the 1988 TCP/IP Interoperability Exhibition and Solutions Showcase in Santa Clara, Calif., which was given the shorter, catchier name “Interop.” Margot Simmons
Interop featured lots of products: “every medium, every bridge box, every router you can imagine,” according to Peter de Vries of the Wollongong Group, which was responsible for putting together the network at Interop. That network provided connections among all vendors on display, including equipment available for purchase from Cisco Systems, Proteon, and Wellfleet Communications.
Using TCP/IP, attendees could traverse links to NSFNET, the regional BARRNET in San Francisco, and a variety of other networks. Vendors could participate in TCP/IP “bake-offs,” where they could check to see whether their equipment interoperated with other vendors’ products. Self-appointed “net police” went so far as to hand out “tickets” to implementations that did not comply with the TCP/IP specifications.
In many respects, Interop ’88 was far more successful than ENE. It featured working products from more vendors than did ENE. And whereas ENE carried the burden of people’s expectations that it would provide comprehensive solutions for large-scale manufacturing, office, and government procurement, Interop took on the immediate and narrower problems of network interconnection. In the “age of standards,” as an article in Data Communications referred to that time, this focus on product compatibility, interoperability, and connectivity energized the estimated 5,000 attendees as well as the market for TCP/IP products.
The stage was now set for innovations that would change global society: the invention of the World Wide Web the following year and the privatization of the NSFNET/Internet backbone in the mid-1990s. The advances in global computer networking that have come since then all rest on that initial foundation.
Accounts of the beginnings of modern computing often include dramatic descriptions of a conference that has since become known as “the Mother of all Demos”—a 1968 joint meeting of the Association for Computing Machinery and the IEEE Computer Society where ARPA-funded researcher Douglas Engelbart gave a 90-minute presentation that included the use of windows, hypertext, videoconferencing, and the computer mouse, among other innovations. His demo is rightly recognized as a turning point for expanding the realm of the possible in personal computing. But mind-expanding possibilities were also on display—and sometimes even for sale—at the five meetings we’ve described here. In our view, the contribution of these industry events to the development of today’s world of computing shouldn’t be forgotten, because the connection of different kinds of computers is the advance that has transformed our lives.
Loring Robbins and Andrew Russell dedicate this article to their coauthor and longtime friend James Pelkey, who died shortly before it was published.
A rocket carrying CubeSats launched into Earth orbit two years ago, on 22 March 2021. Two of those CubeSats represented competing approaches to bringing the Internet of Things (IoT) to space. One, operated by Lacuna Space, uses a protocol called LoRaWAN, a long-range, low-power protocol owned by Semtech. The other, owned by Sateliot, uses the narrowband IoT protocol, following in the footsteps of OQ Technology, which launched a similar IoT satellite demonstration in 2019. And separately, in late 2022, the cellular industry standard-setter 3GPP incorporated satellite-based 5G into standard cellular service with its release 17.
In other words, there is now an IoT space race.
Like Lacuna and Sateliot, OQ Technology is nipping at the heels of satellite telecom incumbents such as Iridium, Orbcomm, and Inmarsat for a share of the growing satellite-IoT subscriber market. OQ Technology has three satellites in low Earth orbit and plans to launch seven more this year, says the company’s chief innovation officer, Prasanna Nagarajan. OQ has paying customers in the oil and gas, agriculture, and transport logistics industries.
Sateliot, based in Barcelona, has in orbit the one satellite it launched in 2021 and plans to launch four more this year, says Sateliot’s business development manager, Paula Caudet. The company is inviting early adopters to sample its service for free this year while it builds more coverage. “Certain use cases are fine with flybys every few hours, such as agricultural sensors,” Caudet says. OQ Technology claims it will launch enough satellites to offer at least hourly coverage by 2024 and near-real-time coverage later that year. Sateliot is perhaps one year behind OQ Technology.
Incumbent satellite operators are already offering IoT coverage, but so far they require specific IoT hardware tuned to their spectrum bands and protocols. Insurgent companies that make use of the 3GPP release 17 standard will be able to offer satellite connectivity to devices originally designed to connect only to cellular towers.
New companies also see an opportunity to offer lower, more attractive pricing. “Legacy satellite providers were charging maybe [US] $100 for a few kilobits of data, and customers are not willing to pay so much for IoT,” says Nagarajan. “There seemed to be a huge market gap.” Another company, Swarm, which is a subsidiary of SpaceX, offers low-bandwidth connectivity via proprietary devices to its tiny satellites for $5 per month.
Thanks to shared launch infrastructure and cheaper IoT-compatible modules and satellites, new firms can compete with companies that have had satellites in orbit for decades. More and more hardware and services are available on an off-the-shelf basis. “An IoT-standard module is maybe 8 or 10 euros, versus 300 euros for satellite-specific modules,” says Caudet.
In fact, Sateliot contracted the construction of its first satellite to Open Cosmos. Open Cosmos mission manager Jordi Castellví says that CubeSat subsystems and certain specialized services are now available online from suppliers including AlénSpace, CubeSatShop, EnduroSat, and Isispace, among others.
By building constellations of hundreds of satellites with IoT modules in low Earth orbit, IoT-satellite companies will be able to save money on hardware and still detect the faint signals from IoT gateways or even individual IoT sensors, such as those aboard shipping containers packed onto cargo ships at sea. They won’t move as much data as voice and broadband offerings in the works from AST SpaceMobile and Lynk Global’s larger and more complex satellites, for example, but they may be able to meet growing demand for narrowband applications.
OQ Technology has its own licensed spectrum and can operate as an independent network operator for IoT users with the latest 3GPP release—although at first most users might not have direct contact with such providers; both Sateliot and OQ Technology have partnered with existing mobile-network operators to offer a sort of global IoT roaming package. For example, while a cargo ship is in port, a customer’s onboard IoT device will transmit via the local cellular network. Farther out at sea, the device will switch to transmitting to satellites overhead. “The next step is being able to integrate cellular and satellite services,” Caudet says.
Can advanced semiconductors cut emissions of greenhouse gases enough to make a difference in the struggle to halt climate change? The answer is a resounding yes. Such a change is actually well underway.
Starting around 2001, the compound semiconductor gallium nitride fomented a revolution in lighting that has been, by some measures, the fastest technology shift in human history. In just two decades, the share of the global lighting market held by gallium-nitride-based light-emitting diodes has gone from zero to more than 50 percent, according to a study by the International Energy Agency. The research firm Mordor Intelligence recently predicted that, worldwide, LED lighting will be responsible for cutting the electricity used for lighting by 30 to 40 percent over the next seven years. Globally, lighting accounts for about 20 percent of electricity use and 6 percent of carbon dioxide emissions, according to the United Nations Environment Program.
Each wafer contains hundreds of state-of-the-art power transistors. Peter Adams
This revolution is nowhere near done. Indeed, it is about to jump to a higher level. The very semiconductor technology that has transformed the lighting industry, gallium nitride (GaN), is also part of a revolution in power electronics that is now gathering steam. It is one of two semiconductors—the other being silicon carbide (SiC)—that have begun displacing silicon-based electronics in enormous and vital categories of power electronics.
GaN and SiC devices perform better and are more efficient than the silicon components they are replacing. There are countless billions of these devices all over the world, and many of them operate for hours every day, so the energy savings are going to be substantial. The rise of GaN and SiC power electronics will ultimately have a greater positive impact on the planet’s climate than will the replacement of incandescent and other legacy lighting by GaN LEDs.
Virtually everywhere that alternating current must be transformed to direct current or vice versa, there will be fewer wasted watts. This conversion happens in your phone’s or laptop’s wall charger, in the much larger chargers and inverters that power electric vehicles, and elsewhere. And there will be similar savings as other silicon strongholds fall to the new semiconductors, too. Wireless base-station amplifiers are among the growing applications for which these emerging semiconductors are clearly superior. In the effort to mitigate climate change, eliminating waste in power consumption is the low-hanging fruit, and these semiconductors are the way we’ll harvest it.
This is a new instance of a familiar pattern in technology history: two competing innovations coming to fruition at the same time. How will it all shake out? In which applications will SiC dominate, and in which will GaN prevail? A hard look at the relative strengths of these two semiconductors gives us some solid clues.
Before we get to the semiconductors themselves, let’s first consider why we need them. To begin with: Power conversion is everywhere. And it goes far beyond the little wall chargers that sustain our smartphones, tablets, laptops, and countless other gadgets.
Power conversion is the process that changes electricity from the form that’s available to the form required for a product to perform its function. Some energy is always lost in that conversion, and because some of these products run continuously, the energy savings can be enormous. Consider: Electricity consumption in the state of California has remained essentially flat since 1980, even as the state’s economic output skyrocketed. One of the most important reasons demand remained flat is that the efficiency of refrigerators and air conditioners increased enormously over that period. The single greatest factor in this improvement has been the use of variable-speed drives based on the insulated-gate bipolar transistor (IGBT) and other power electronics.
SiC and GaN are going to enable far greater reductions in emissions. GaN-based technologies alone could lead to a savings of over 1 billion tonnes of greenhouse gases in 2041 in just the United States and India, according to an analysis of publicly available data by Transphorm, a GaN-device company I cofounded in 2007. The data came from the International Energy Agency, Statista, and other sources. The same analysis indicates a 1,400-terawatt-hour energy savings—or 10 to 15 percent of the projected energy consumption by the two countries that year.
Like an ordinary transistor, a power transistor can act as an amplifying device or as a switch. An important example of the amplifying role is in wireless base stations, which amplify signals for transmission to smartphones. All over the world, the semiconductor used to fabricate the transistors in these amplifiers is shifting from a silicon technology called laterally diffused metal-oxide semiconductor (LDMOS) to GaN. The newer technology has many advantages, including a power-efficiency improvement of 10 percent or more depending on frequencies. In power-conversion applications, on the other hand, the transistor acts as a switch rather than as an amplifier. The standard technique is called pulse-width modulation. In a common type of motor controller, for example, pulses of direct-current electricity are fed to coils mounted on the motor’s rotor. These pulses set up a magnetic field that interacts with that of the motor’s stator, which makes the rotor spin. The speed of this rotation is controlled by altering the length of the pulses: A graph of these pulses is a square wave, and the longer the pulses are “on” rather than “off,” the more rotational speed and torque the motor provides. Power transistors accomplish the on-and-off switching.
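The arithmetic behind pulse-width modulation is simple enough to sketch in a few lines of code. The supply voltage and duty cycles below are illustrative, not drawn from any particular motor controller:

```python
# Illustrative sketch of pulse-width modulation (PWM): the average
# voltage delivered to a motor coil scales with the duty cycle, the
# fraction of each switching period the power transistor is "on".
# All numbers here are hypothetical, chosen to show the relationship.

def pwm_average_voltage(v_supply: float, duty_cycle: float) -> float:
    """Average voltage of an ideal PWM square wave.

    v_supply   -- DC supply voltage, in volts
    duty_cycle -- fraction of the period the transistor is on (0 to 1)
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty cycle must be between 0 and 1")
    return v_supply * duty_cycle

# Longer "on" pulses -> higher average voltage -> more speed and torque.
for duty in (0.25, 0.50, 0.75):
    print(f"duty {duty:.2f}: {pwm_average_voltage(48.0, duty):.1f} V average")
```

Real controllers add dead time and compensate for switching losses, but the duty-cycle relationship is the core of the technique.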
Pulse-width modulation is also used in switching power supplies, one of the most common examples of power conversion. Switching power supplies are the type used to power virtually all personal computers, mobile devices, and appliances that run on DC. Basically, the input AC voltage is converted to DC, and then that DC is “chopped” into a high-frequency alternating-current square wave. This chopping is done by power transistors, which create the square wave by switching the DC on and off. The square wave is applied to a transformer that changes the amplitude of the wave to produce the desired output voltage. To get a steady DC output, the voltage from the transformer is rectified and filtered.
The important point here is that the characteristics of the power transistors determine, almost entirely, how well the circuits can perform pulse-width modulation—and therefore, how efficiently the controller regulates the voltage. An ideal power transistor would, when in the off state, completely block current flow even when the applied voltage is high. This characteristic is called high electric breakdown field strength, and it indicates how much voltage the semiconductor is able to withstand. On the other hand, when it is in the on state, this ideal transistor would have very low resistance to the flow of current. This feature results from very high mobility of the charges—electrons and holes—within the semiconductor’s crystalline lattice. Think of breakdown field strength and charge mobility as the yin and yang of a power semiconductor.
GaN transistors are very unusual because most of the current flowing through them is due to electron velocity rather than electron charge.
GaN and SiC come much closer to this ideal than the silicon semiconductors they are replacing. First, consider breakdown field strength. Both GaN and SiC belong to a class called wide-bandgap semiconductors. The bandgap of a semiconductor is defined as the energy, in electron volts, needed for an electron in the semiconductor lattice to jump from the valence band to the conduction band. An electron in the valence band participates in the bonding of atoms within the crystal lattice, whereas in the conduction band electrons are free to move around in the lattice and conduct electricity.
In a semiconductor with a wide bandgap, the bonds between atoms are strong and so the material is usually able to withstand relatively high voltages before the bonds break and the transistor is said to break down. The bandgap of silicon is 1.12 electron volts, as compared with 3.40 eV for GaN. For the most common type of SiC, the bandgap is 3.26 eV. [See table below, “The Wide-Bandgap Menagerie”]
Now let’s look at mobility, which is given in units of centimeters squared per volt second (cm2/V·s). The product of mobility and electric field yields the velocity of the electron, and the higher the velocity, the higher the current carried for a given amount of moving charge. For silicon this figure is 1,450; for SiC it is around 950; and for GaN, about 2,000. GaN’s unusually high value is the reason why it can be used not only in power-conversion applications but also in microwave amplifiers. GaN transistors can amplify signals with frequencies as high as 100 gigahertz—far above the 3 to 4 GHz generally regarded as the maximum for silicon LDMOS. For reference, 5G’s millimeter-wave frequencies top out at 52.6 GHz. This highest 5G band is not yet widely used. However, frequencies up to 75 GHz are being deployed in dish-to-dish communications, and researchers are now working with frequencies as high as 140 GHz for in-room communications. The appetite for bandwidth is insatiable.
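The material figures quoted above can be collected in a short sketch. Drift velocity is the product of mobility and electric field (a low-field approximation; velocity saturates at high fields). The applied field below is an arbitrary illustrative value:

```python
# Electron drift velocity = mobility x electric field (low-field model).
# Bandgaps (eV) and mobilities (cm^2/V*s) are the figures quoted in the
# text; "SiC" here means the most common polytype. The field value is
# illustrative only.

MATERIALS = {
    #  name    (bandgap_eV, mobility_cm2_per_Vs)
    "Si":  (1.12, 1450),
    "SiC": (3.26,  950),
    "GaN": (3.40, 2000),
}

def drift_velocity(mobility_cm2_per_vs: float, field_v_per_cm: float) -> float:
    """Electron drift velocity in cm/s for a given applied field."""
    return mobility_cm2_per_vs * field_v_per_cm

field = 1e3  # V/cm, illustrative
for name, (bandgap, mobility) in MATERIALS.items():
    v = drift_velocity(mobility, field)
    print(f"{name}: bandgap {bandgap} eV, mobility {mobility} cm^2/V.s, "
          f"v = {v:.2e} cm/s")
```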
These performance figures are important, but they’re not the only criteria by which GaN and SiC should be compared for any particular application. Other critical factors include ease of use and cost, for both the devices and the systems into which they are integrated. Taken together, these factors explain where and why each of these semiconductors has begun displacing silicon—and how their future competition may shake out.
The first commercially viable SiC transistor that was superior to silicon was introduced by Cree (now Wolfspeed) in 2011. It could block 1,200 volts and had a respectably low resistance of 80 milliohms when conducting current. Today there are three different kinds of SiC transistors on the market. There’s a trench MOSFET (metal-oxide semiconductor field-effect transistor) from Rohm; DMOSs (double-diffused MOSs) from Infineon Technologies, ON Semiconductor Corp., STMicroelectronics, Wolfspeed, and others; and a vertical-junction field-effect transistor from Qorvo.
One of the big advantages of SiC MOSFETs is their similarity to traditional silicon ones—even the packaging is identical. A SiC MOSFET operates in essentially the same way as an ordinary silicon MOSFET. There’s a source, a gate, and a drain. When the device is on, electrons flow from a heavily doped n-type source across a lightly doped bulk region before being “drained” through a conductive substrate. This similarity means that there’s little learning curve for engineers making the switch to SiC.
Compared to GaN, SiC has other advantages. SiC MOSFETs are inherently “fail-open” devices, meaning that if the control circuit fails for any reason the transistor stops conducting current. This is an important feature, because this characteristic largely eliminates the possibility that a failure could lead to a short circuit and a fire or explosion. (The price paid for this feature, however, is a lower electron mobility, which increases resistance when the device is on.)
GaN brings its own unique advantages. The semiconductor first established itself commercially in 2000 in the markets for light-emitting diodes and semiconductor lasers. It was the first semiconductor capable of reliably emitting bright green, blue, purple, and ultraviolet light. But long before this commercial breakthrough in optoelectronics, I and other researchers had already demonstrated the promise of GaN for high-power electronics. GaN LEDs caught on quickly because they filled a void for efficient lighting. But GaN for electronics had to prove itself superior to existing technologies: in particular, silicon CoolMOS transistors from Infineon for power electronics, and silicon-LDMOS and gallium-arsenide transistors for radio-frequency electronics.
GaN’s main advantage is its extremely high electron mobility. Electric current, the flow of charge, equals the concentration of the charges multiplied by their velocity. So you can get high current because of high concentration or high velocity or some combination of the two. The GaN transistor is unusual because most of the current flowing through the device is due to electron velocity rather than charge concentration. What this means in practice is that, in comparison with Si or SiC, less charge has to flow into the device to switch it on or off. That, in turn, reduces the energy needed for each switching cycle and contributes to high efficiency.
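This trade-off between concentration and velocity can be made concrete with the textbook current-density relation J = q × n × v. The carrier concentrations and velocities below are hypothetical, chosen only to show that two very different devices can carry the same current:

```python
# Current density J = q * n * v: elementary charge times carrier
# concentration times carrier velocity. The same current can come from
# many slower carriers or from fewer, faster ones; GaN leans on velocity.
# The concentrations and velocities below are illustrative, not measured.

Q_E = 1.602e-19  # elementary charge, coulombs

def current_density(n_per_cm3: float, v_cm_per_s: float) -> float:
    """Current density in A/cm^2."""
    return Q_E * n_per_cm3 * v_cm_per_s

# Two hypothetical devices carrying the same current density:
many_slow   = current_density(n_per_cm3=1e17, v_cm_per_s=1e6)
fewer_fast  = current_density(n_per_cm3=1e16, v_cm_per_s=1e7)
print(many_slow, fewer_fast)  # same current, one-tenth the charge to move
```

Because the "fewer, faster" device stores a tenth of the channel charge, far less charge must flow in and out of it on each switching cycle, which is the efficiency advantage the text describes.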
Meanwhile, GaN’s high electron mobility allows switching speeds on the order of 50 volts per nanosecond. That characteristic means power converters based on GaN transistors operate efficiently at frequencies in the multiple hundreds of kilohertz, as opposed to about 100 kilohertz for silicon or SiC.
Taken together, high efficiency and high frequency enable power converters based on GaN devices to be quite small and lightweight: High efficiency means smaller heat sinks, and operation at high frequencies means that the inductors and capacitors can be very small, too.
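The shrinking of the passive components follows from a standard design relation: in a buck converter, for example, the inductance needed for a given current ripple is inversely proportional to switching frequency. The converter values below are illustrative, not taken from any particular product:

```python
# Why higher switching frequency shrinks the magnetics: in a buck
# converter, the minimum inductance for a given current ripple is
#   L = Vout * (1 - Vout/Vin) / (f_sw * delta_I)
# a standard textbook relation. Component values below are illustrative.

def buck_inductance(v_out: float, v_in: float, f_sw_hz: float,
                    ripple_a: float) -> float:
    """Inductance (henries) for a buck converter at a given ripple."""
    duty = v_out / v_in
    return v_out * (1 - duty) / (f_sw_hz * ripple_a)

# The same 48 V -> 12 V converter at silicon-class vs GaN-class
# switching frequencies, with 1 A of allowed ripple:
l_100k = buck_inductance(12.0, 48.0, 100e3, 1.0)  # ~100 kHz (Si/SiC)
l_500k = buck_inductance(12.0, 48.0, 500e3, 1.0)  # ~500 kHz (GaN)
print(f"{l_100k * 1e6:.0f} uH -> {l_500k * 1e6:.0f} uH")  # 5x smaller inductor
```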
One disadvantage of GaN semiconductors is that they do not yet have a reliable insulator technology. This complicates the design of devices that are fail-safe—in other words, that fail open if the control circuit fails.
There are two options to achieve this normally off characteristic. One is to equip the transistor with a type of gate that removes the charge in the channel when there’s no voltage applied to the gate and that conducts current only on application of a positive voltage to that gate. These are called enhancement-mode devices. They are offered by EPC, GaN Systems, Infineon, Innoscience, and Navitas, for example. [See illustration, "Enhancement-Mode GaN Transistor"]
The other option is called the cascode solution. It uses a separate, low-loss silicon field-effect transistor to provide the fail-safe feature for the GaN transistor. This cascode solution is used by Power Integrations, Texas Instruments, and Transphorm. [See illustration, "Cascoded Depletion-Mode GaN Transistor"]
No comparison of semiconductors is complete without a consideration of costs. A rough rule of thumb is that a smaller die means a lower cost. Die size is the physical area of the integrated circuit containing the devices.
SiC devices now generally have smaller dies than GaN ones. However, SiC’s substrate and fabrication costs are higher than those for GaN and, in general, the final device costs for applications at 5 kilowatts and higher are not much different today. Future trends, though, are likely to favor GaN. I base this belief on the relative simplicity of GaN devices, which will mean production costs low enough to overcome the larger die size.
That said, for GaN to be viable for many high-power applications that also demand high voltages, it must have a cost-effective, high-performance device rated for 1,200 V. After all, there are already SiC transistors available at that voltage. Currently, the closest commercially available GaN transistors are rated for 900 V, produced by Transphorm, which I cofounded with Primit Parikh. Lately, we have also demonstrated 1,200-V devices, fabricated on sapphire substrates, that have both electrical and thermal performance on a par with SiC devices.
Projections from the research firm Omdia for 1,200-V SiC MOSFETs indicate a price of 16 cents per ampere in 2025. In my estimation, because of the lower cost of GaN substrates, the price of first-generation 1,200-V GaN transistors in 2025 will be less than that of their SiC counterparts. Of course, that’s just my opinion; we’ll all know for sure how this will shake out in a couple of years.
With these relative advantages and disadvantages in mind, let’s consider individual applications, one by one, and shed some light on how things might develop.
• Electric vehicle inverters and converters: Tesla’s adoption of SiC in 2017 for the onboard, or traction, inverters for its Model 3 was an early and major win for the semiconductor. In an EV, the traction inverter converts the DC from the batteries to AC for the motor. The inverter also controls the speed of the motor by varying the frequency of the alternating current. Today, Mercedes-Benz and Lucid Motors are also using SiC in their inverters and other EV makers are planning to use SiC in upcoming models, according to news reports. The SiC devices are being supplied by Infineon, OnSemi, Rohm, Wolfspeed, and others. EV traction inverters typically range from about 35 kW to 100 kW for a small EV to about 400 kW for a large vehicle.
However, it’s too soon to call this contest for SiC. As I noted, to make inroads in this market, GaN suppliers will have to offer a 1,200-V device. EV electrical systems now typically operate at just 400 volts, but the Porsche Taycan has an 800-V system, as do EVs from Audi, Hyundai, and Kia. Other automakers are expected to follow their lead in coming years. (The Lucid Air has a 900-V system.) I expect to see the first commercial 1,200-V GaN transistors in 2025. These devices will be used not only in vehicles but also in high-speed public EV chargers.
The higher switching speeds possible with GaN will be a powerful advantage in EV inverters, because these switches employ what are called hard-switched techniques. Here, the way to enhance performance is to switch very fast from on to off to minimize the time when the device is both holding high voltage and passing high current.
Besides an inverter, an EV also typically has an onboard charger, which enables the vehicle to be charged from wall (mains) current by converting AC to DC. Here, again, GaN is very attractive, for the same reasons that make it a good choice for inverters.
• Electric-grid applications: Very-high-voltage power conversion for devices rated at 3 kV and higher will remain the domain of SiC for at least the next decade. These applications include systems to help stabilize the grid, convert AC to DC and back again at transmission-level voltages, and other uses.
• Phone, tablet, and laptop chargers: Starting in 2019, GaN-based wall chargers became available commercially from companies such as GaN Systems, Innoscience, Navitas, Power Integrations, and Transphorm. The high switching speeds of GaN coupled with its generally lower costs have made it the incumbent in lower-power markets (25 to 500 W), where these factors, along with small size and a robust supply chain, are paramount. These early GaN power converters had switching frequencies as high as 300 kHz and efficiencies above 92 percent. They set records for power density, with figures as high as 30 W per cubic inch (1.83 W/cm3)—roughly double the density of the silicon-based chargers they are replacing.
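The power-density figure quoted above is just a unit conversion, which a couple of lines of code can verify:

```python
# Checking the charger power-density figure from the text:
# converting watts per cubic inch to watts per cubic centimeter.

IN3_TO_CM3 = 2.54 ** 3  # about 16.39 cubic centimeters per cubic inch

def w_per_in3_to_w_per_cm3(density_w_per_in3: float) -> float:
    """Convert a power density from W/in^3 to W/cm^3."""
    return density_w_per_in3 / IN3_TO_CM3

print(f"{w_per_in3_to_w_per_cm3(30):.2f} W/cm^3")  # -> 1.83 W/cm^3
```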
An automated system of probes applies a high voltage to stress-test power transistors on a wafer. The system, at Transphorm, tests each of some 500 dies in minutes. Peter Adams
• Solar-power microinverters: Solar-power generation has taken off in recent years, in both grid-scale and distributed (household) applications. For every installation, an inverter is needed to convert the DC from the solar panels to AC to power a home or release the electricity to the grid. Today, grid-scale photovoltaic inverters are the domain of silicon IGBTs and SiC MOSFETs. But GaN will begin making inroads, particularly in the distributed solar market.
Traditionally, in these distributed installations, there was a single inverter box for all of the solar panels. But increasingly installers are favoring systems in which there is a separate microinverter for each panel, and the AC is combined before powering the house or feeding the grid. Such a setup means the system can monitor the operation of each panel in order to optimize the performance of the whole array.
Microinverter or traditional inverter systems are critical to the modern data center. Coupled with batteries they create an uninterruptible power supply to prevent outages. Also, all data centers use power-factor correction circuits, which adjust the power supply’s alternating-current waveforms to improve efficiency and remove characteristics that could damage equipment. And for these, GaN provides a low-loss and economical solution that is slowly displacing silicon.
• 5G and 6G base stations: GaN’s superior speed and high power density will enable it to win and ultimately dominate applications in the microwave regimes, notably 5G and 6G wireless, and commercial and military radar. The main competition here comes from arrays of silicon LDMOS devices, which are cheaper but offer lower performance. Indeed, GaN has no real competitor at frequencies of 4 GHz and above.
For 5G and 6G wireless, the critical parameter is bandwidth, because it determines how much information the hardware can transmit efficiently. Next-generation 5G systems will have nearly 1 GHz of bandwidth, enabling blazingly fast video and other applications.
Microwave-communication systems that use silicon-on-insulator technologies provide a 5G+ solution using high-frequency silicon devices where each device’s low output power is overcome with large arrays of them. GaN and silicon will coexist for a while in this space. The winner in a specific application will be determined by a trade-off among system architecture, cost, and performance.
• Radar: The U.S. military is deploying many ground-based radar systems based on GaN electronics. These include the Ground/Air Task Oriented Radar and the Active Electronically Scanned Array Radar built by Northrop Grumman for the U.S. Marine Corps. Raytheon’s SPY-6 radar was delivered to the U.S. Navy and tested for the first time at sea in December 2022. The system greatly extends the range and sensitivity of shipborne radar.
Today, SiC dominates in EV inverters, and generally wherever voltage-blocking capability and power handling are paramount and where the frequency is low. GaN is the preferred technology where high-frequency performance matters, such as in base stations for 5G and 6G, and for radar and high-frequency power-conversion applications such as wall-plug adapters, microinverters, and power supplies.
But the tug-of-war between GaN and SiC is just beginning. Regardless of how the competition plays out, application by application and market by market, we can say for sure that the Earth’s environment will be a winner. Countless billions of tonnes of greenhouse gases will be avoided in coming years as this new cycle of technological replacement and rejuvenation wends its way inexorably forward.
The Big Picture features technology through the lens of photographers.
Every month, IEEE Spectrum selects the most stunning technology images recently captured by photographers around the world. We choose images that reflect an important advance, or a trend, or that are just mesmerizing to look at. We feature all images on our site, and one also appears on our monthly print edition.
Enjoy the latest images, and if you have suggestions, leave a comment below.
For many years, environmentalists have looked forward to the coming of net-zero-energy buildings. Much effort was devoted to making lighting, heating, and cooling more efficient so buildings consumed less energy. But the net-zero target would never have been reachable without innovations in renewable-energy generation that let structures generate power on-site. Now residential and commercial buildings can be outfitted with roofing tiles that double as solar panels, or with rooftop boxes like this low-profile unit that transforms gusts of wind into electric current. This WindBox turbine, installed on the roof of a building in Rouen, France, is 1.6 meters tall and has a 4-square-meter footprint (leaving plenty of space for solar panels or tiles). The unit, which weighs 130 kilograms, can generate up to 2,500 kilowatt-hours of electricity per year (enough to meet roughly one-quarter of the energy needs of a typical U.S. household).
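The "one-quarter" claim is easy to sanity-check. The household figure below is an assumption on my part, roughly the recent U.S. Energy Information Administration average, not a number from the caption:

```python
# Sanity check of the WindBox claim: 2,500 kWh/year against an average
# U.S. household's annual electricity use. The household figure is an
# assumed value (roughly the recent EIA average), not from the caption.

US_HOUSEHOLD_KWH_PER_YEAR = 10_600  # assumed average annual consumption
WINDBOX_KWH_PER_YEAR = 2_500        # figure stated in the caption

fraction = WINDBOX_KWH_PER_YEAR / US_HOUSEHOLD_KWH_PER_YEAR
print(f"{fraction:.0%} of an average household's use")  # roughly one-quarter
```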
This is the giant horn antenna that was used in physics research that led to the discovery of background cosmic radiation, which provided support for the big bang theory. Two Bell Labs researchers who were painstakingly attempting to eliminate noise from certain radio signals eventually realized that the noise didn’t arise from an antenna malfunction. It was, in fact, an artifact of the big bang, which created the cosmos. Now, this antenna, which was critical to their work, is under threat of being dismantled. The Holmdel, N.J., research site is now in private hands, and could be slated for rezoning and redevelopment, which might doom the instrument that made the Nobel Prize–winning discovery possible.
The incandescent lightbulb, in addition to being a world-changing invention, is the prototypical example of something that wastes a lot of energy, giving it off as heat instead of light. Our bodies slough off a ton of heat too. Because generating heat is an inescapable part of our metabolic processes, researchers have been working to turn a lightbulb moment—the idea of harnessing body heat so all that thermal energy isn’t wasted—into a practical device that yields electric power. Thermoelectric generators, or TEGs, have, in fact, been around for a while. But a new generation of TEGs uses cheaper, less toxic materials that convert both heat and kinetic energy to electricity with greater efficiency than earlier versions. On the TEG pictured here, hot and cold regions represented by the zebra’s stripes produce a temperature gradient that creates a voltage difference. The result: electrical current. Its creators, researchers at Gwangju Institute of Science and Technology, in South Korea, envision weaving such devices into fabrics so that someday our garments will double as power outlets for our ubiquitous portable electronic gadgets.
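The temperature-gradient-to-voltage conversion follows the Seebeck relation, V = S × ΔT per thermocouple pair, scaled by the number of pairs wired in series. The coefficient, temperatures, and pair count below are illustrative orders of magnitude, not measurements from the Gwangju device:

```python
# Open-circuit voltage of an idealized thermoelectric generator (TEG):
# the Seebeck relation V = S * (T_hot - T_cold), multiplied by the
# number of thermocouple pairs in series. All values are illustrative.

def teg_voltage(seebeck_v_per_k: float, t_hot_k: float, t_cold_k: float,
                n_pairs: int = 1) -> float:
    """Open-circuit voltage (volts) of an ideal TEG."""
    return n_pairs * seebeck_v_per_k * (t_hot_k - t_cold_k)

# Skin at ~307 K against room air at ~295 K, 100 pairs in series,
# 200 uV/K per pair (a typical order of magnitude for such materials):
v = teg_voltage(200e-6, 307.0, 295.0, n_pairs=100)
print(f"{v * 1000:.0f} mV")  # -> 240 mV
```

The small per-pair voltage is why practical body-heat TEGs stack many junctions in series and often add a boost converter.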
Millimeter-wave power amplifiers are necessary for applications that require the highest data rates possible—mostly communications across vast distances. But with a price tag topping US $1 million each, it is easy to see why there is an urgent push for production innovation to bring the cost way down. Diana Gamzina, founder and CEO of Davis, Calif., startup Elve Speed, embodies the mission to bring ultrahigh-speed wireless connectivity to remote and urban areas. To accomplish this, her company has taken to 3D printing millimeter-wave amps to sidestep the manual, high-precision manufacturing processes that make the price of the amps so stratospheric.
Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.
Enjoy today’s videos!
GITAI conducted a demonstration of lunar base construction using two GITAI inchworm-type robotic arms and two GITAI Lunar Robotic Rovers in a simulated lunar environment and successfully completed all planned tasks. The GITAI robots have successfully passed various tests corresponding to Level 4 of NASA’s Technology Readiness Levels (TRL) in a simulated lunar environment in the desert.
[ GITAI ]
This is 30 minutes of Agility Robotics’ Digit being productive at ProMat. The fact that it gets boring and repetitive to watch reinforces how much this process needs robots, and is also remarkable because bipedal robots can now be seen as just another tool.
[ Agility Robotics ]
We are now one step closer to mimicking Baymax’s skin, with softness and whole-body sensing, which may benefit social or task-based touch and interaction over large areas. We constructed a robot arm with a soft skin and vision-based tactile sensing, and we showcase this method for our large-scale tactile sensor (TacLink) in two scenarios: whole-arm nonprehensile manipulation, and intuitive motion guidance using a custom-built tactile robot arm integrated with TacLink.
[ Paper ]
Meet Fifi, a software engineering team lead at Boston Dynamics. Hear her perspective on how she got into engineering, why she wouldn’t trust Stretch with her pet cactus, and much more—as she answers questions from kids and other curious minds.
[ Boston Dynamics ]
Take a look at this seven-ingredient printed dessert and ask yourself whether conventional cooking appliances such as ovens, stovetops, and microwaves may one day be replaced by cooking devices that incorporate three-dimensional (3D) printers, lasers, or other software-driven processes.
[ Paper ]
What if you just loaded the robots onto the truck?!
[ Slip Robotics ]
As weird as this looks, it’s designed to reduce the burden on caregivers by automating tooth brushing.
[ RobotStart ]
Relay is still getting important work done in hospitals.
[ Relay Robotics ]
Real cars are expensive, simulation is fake, but MIT’s MiniCity is just the right compromise for developing safer autonomous vehicles.
[ Paper ]
Robot-to-human mechanical tool handover is a common task in a human-robot collaborative assembly where humans are performing complex, high-value tasks and robots are performing supporting tasks. We explore an approach to ensure the safe handover of mechanical tools to humans. Our experimental results indicate that our system can safely and effectively hand off many different types of tools. We have tested the system’s ability to successfully handle contingencies that may occur during the handover process.
[ USC Viterbi ]
Autonomous vehicle (AV) uncertainty is at an all-time high. Michigan Engineering researchers aim to change that. A team of researchers used artificial intelligence to train virtual vehicles that can challenge AVs in a virtual or augmented reality testing environment. The virtual cars were only fed safety-critical training data, making them better equipped to challenge AVs with more of those rare events in a shorter amount of time.
[ Michigan ]
All of the sea lamprey detection problems you never knew you had are now solved.
[ Paper ]
OTTO Motors is thrilled to announce the official launch of our newest autonomous mobile robot (AMR)—OTTO 600. We have also released a major update that makes industry-leading strides in software development. With this, the industry’s most comprehensive AMR fleet is unveiled, enabling manufacturers to automate any material handling job up to 4,200 lb.
[ OTTO Motors ]
From falling boxes to discarded mattresses, we prepare the Waymo Driver to identify and navigate around all kinds of debris on public roads. See how we use debris tests at our closed-course facilities to prepare our Waymo Driver for any foreign objects and debris it may encounter on public roads.
[ Waymo ]
Over 500 students participated in the 2022 Raytheon Technologies UK quadcopter challenge ... covering all the British Isles.
[ Raytheon ]
We are delighted to share a new research report that explores trends in patent maintenance behaviors revealed through the analysis of two decades of patent data.
By collecting and analyzing different data points, we explore the trends and directionality of patent filing and maintenance by jurisdiction and sector to understand what survives. The insights within the report create a clearer profile of patent maintenance behaviors, affording readers a unique perspective on the renewals landscape and the strategic value of annuities.
A new sensor could help reduce the number of accidents caused by impaired driving and could protect children left in hot cars. The Wireless Intelligent Sensing millimeter-wave radar system, developed by startup Pontosense, monitors vehicle occupants’ vital signs, and it can detect the presence of passengers in the vehicle and where they are seated.
The WISe system measures the driver’s vital signs, including heart rate and breathing, to detect fatigue and possible medical emergencies. WISe sends out signals with wavelengths short enough to measure the tiny movements of a person’s body caused by breathing and pulse. The signal echo is then analyzed by an artificial intelligence (AI) algorithm.
“There is an urgent need for this kind of technology,” says cofounder Alex S. Qi, the startup’s CEO. A recent study on the causes of motor vehicle accidents found that fatigue and medical emergencies were the top reasons.
WISe is the first in-market millimeter-wave wireless sensor used for that purpose in the automotive industry, Qi says. The system is expected to be installed in several vehicle models in the near future, he says.
Current in-car driver-monitoring systems require either cameras or contact sensors, but WISe works wirelessly. The system uses RF sensing to capture echoes of the micromovements of the driver’s or passenger’s body caused by heartbeats and breaths. WISe detects changes in the reflected signal’s phase—the relationship between radio signals that share the same space and frequency—to read the micromovements and vital signs.
“Think of the sensor as bouncing waves off of something, like how whales and bats use sonar and echolocation to gauge where objects are located around them. That’s the basis of our technology,” Qi says.
The sensor transmits millimeter waves from an antenna designed by cofounder and CTO Yihong Qi (Alex and Yihong are related). When the waves are reflected back to the sensor, the received echo allows WISe to “see” the small movements and gather data about the person’s health status.
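The phase-to-movement relationship can be made concrete with a little geometry: a target moving a distance d toward the radar shortens the round trip by 2d, shifting the echo’s phase by 4πd/λ. A minimal sketch; the formula is standard radar physics, not Pontosense’s disclosed algorithm, and the example numbers are illustrative:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def phase_shift(displacement_m: float, freq_hz: float) -> float:
    """Round-trip phase shift (radians) from a target moving
    displacement_m toward the radar: delta_phi = 4*pi*d / lambda."""
    wavelength = C / freq_hz
    return 4 * math.pi * displacement_m / wavelength

# A ~0.5 mm chest movement at 60 GHz (lambda ~ 5 mm) shifts the echo
# phase by more than a radian -- easily measurable, which is why short
# millimeter wavelengths can resolve breathing and heartbeats.
print(round(phase_shift(0.5e-3, 60e9), 3))
```

At the longer 24 GHz wavelength the same movement produces a proportionally smaller shift, which is one reason higher frequencies read fine motion more accurately.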
The system checks heart rate variability and respiration rate. WISe takes the driver’s readings daily to discern what the person’s normal range is and displays the data on the car’s infotainment screen. The information is also stored in a microcomputer in the vehicle as well as external servers so it can be accessed later. The system is encrypted to protect the driver’s personal data from hackers.
The sensors can be installed in different locations in the vehicle, including behind the rearview mirror, behind the instrument panel, and between the driver and passenger seats. The device—about the size of a coin, with a 40-millimeter diameter—uses less than 10 watts of power.
The antenna Yihong designed is 10 nanometers thick to keep the sensor small and flexible, he says. It can transmit millimeter waves at 24, 60, or 77 gigahertz, depending on the vehicle.
“The wavelength at a higher frequency is very short,” Yihong says, “so the sensor is able to more accurately read the driver’s movements.”
The biggest challenge during the sensor’s development, Yihong says, was figuring out how to filter out external noise to ensure accurate readings.
“Take heart rate variability, for example,” he says. “In the hospital, when a patient is having their heart rate measured, sensors are directly attached to them for an accurate reading. But a contactless wireless sensor needs extra help because noises in the environment—such as seatbelts, the car’s engine, and even breathing—are affecting the sensor’s ability to differentiate between the person’s heart rate and irrelevant noise.”
To filter out the noise, Yihong developed signal processing software that uses AI to analyze the data. Algorithms clean up the data, generating a clear radar image for biometrics and communicating with the vehicle and the infotainment system if it suspects an issue with the driver or a passenger. The company says it takes 5 to 10 seconds for WISe to detect a change in the driver, such as an irregular heartbeat, and notify the vehicle.
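One classic, non-AI way to pull a periodic heartbeat out of noise is autocorrelation: a signal repeating at some period correlates strongly with itself at that lag. The toy sketch below uses that idea on synthetic data; Pontosense’s actual pipeline is proprietary and AI-based, and every number here is an assumption for illustration:

```python
import math

def estimate_rate(signal, fs):
    """Estimate the dominant periodic rate (Hz) via autocorrelation,
    searching only lags that correspond to plausible heart rates
    (roughly 0.5-3 Hz). A toy stand-in for the AI filtering above."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]
    best_lag, best_score = None, float("-inf")
    for lag in range(int(fs / 3), int(2 * fs) + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag

fs = 50  # samples per second
# Synthetic chest signal: a 1.2 Hz "heartbeat" plus a deterministic
# 7 Hz interferer standing in for engine vibration.
sig = [math.sin(2 * math.pi * 1.2 * i / fs)
       + 0.4 * math.sin(2 * math.pi * 7.0 * i / fs)
       for i in range(10 * fs)]
print(round(estimate_rate(sig, fs), 2), "Hz")  # close to 1.2
```

Restricting the lag search to physiologically plausible periods is itself a form of noise rejection: the 7 Hz interferer simply cannot win.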
Each car manufacturer will use a different system for alerting the driver if the measurements deviate from the norm, Alex says. It could involve setting off an alarm, slowing down the car or, for autonomous cars, safely pulling over to the side of the road.
The driver can override the system, the company says, if the alert is incorrect or unnecessary.
WISe can detect the presence of passengers in the vehicle and where they are seated. Pontosense
The current way to count the number of passengers in a car is through pressure sensors installed under each seat. But sometimes a sensor is inaccurate because it was triggered by a heavy object such as luggage. Pressure sensors also can fail to count children weighing less than 29 kilograms, according to a 2020 article published in IEEE Access.
Compared with a pressure sensor, WISe can more accurately differentiate whether a passenger is an adult, child, or pet through vital signs and size.
Determining a passenger’s size also matters for airbag deployment. After a crash, airbags eject out of the steering wheel, dashboard, or another location at about 27 kilometers per hour, according to the Insurance Institute for Highway Safety. People who weigh less than 68 kg or are shorter than 1.5 meters can be killed or severely injured because of the airbag’s deployment speed, Alex says. But WISe can communicate to the car to slow down the bag’s deployment by notifying the vehicle that a passenger doesn’t meet the minimum requirements.
The system can also warn whether children or pets have been left in the rear seat by reading their vitals. That could prevent heat stroke or death for those left behind in a hot vehicle. WISe can tell the vehicle to open windows or turn on an alarm to notify those in the area of a forgotten child or pet while the car is not running.
“In-cabin sensing will enable personalized, human-centered enhanced in-car experiences,” Yihong says.
The idea for Pontosense came about while Yihong and Alex were walking the showroom floor at the 2019 Consumer Electronics Show in Las Vegas.
“We saw many companies showcasing devices that could measure vital signs, but all needed two or more sensors to work, were invasive, and were not user-friendly,” Alex says. “They also relied on Wi-Fi.”
“The sensors used low-frequency bandwidth, which couldn’t make accurate readings,” Yihong adds. “They only detected a person if they walked one meter out in front of the sensor, and they couldn’t differentiate between multiple people in a group.”
Yihong’s work in antenna development, radio frequency and electromagnetic compatibility measurement, and digital signal processing algorithms spans decades. An IEEE Fellow, he holds more than 500 patents in China and the United States.
He started to think about how to make a more accurate, more user-friendly, wireless intelligent sensing device. By the time the trade show ended, he already had several ideas. The two decided to team up and develop a technology to be used in cars.
“Both of us have experienced driving while tired, and we thought a sensor would be a great tool to help decrease the number of accidents caused by impaired driving,” Alex says. He adds that they also wanted to reduce the number of deaths caused when people accidently leave a young child behind in a car on a hot day.
They succeeded in creating the wireless sensor in 2021 and founded Pontosense that year to bring the system to market. The startup, based in Toronto, employs more than 120 people, 25 of whom are on its research and development team.
Pontosense received an IEEE Hyper-Intelligence Technical Committee Award for Excellence in Hyper-Intelligence last year for “contributions on wireless intelligent sensing systems for human safety in automobiles,” in the industrial impact category.
Hyperintelligence uses interdisciplinary technologies that work together to accomplish complex tasks. The award is sponsored by the IEEE Sensors Council.
Pontosense participated in VentureLab’s capital investment program in 2021. The program helps participants write pitches, connect with investors, and develop long-term capital strategies. But raising funds wasn’t difficult, Alex says, as Pontosense isn’t his or Yihong’s first company. They’ve founded several other startups, which helped pave the way for the development and manufacturing of the sensors. Their companies include wireless communication device manufacturers Mercku and General Test Systems.
“We have the privilege to be able to mass-produce WISe with our network of companies,” Alex says. “Pontosense can produce hundreds of thousands of modules a month. Our goal for this year is to get the device in as many vehicles as possible so we can save as many lives as possible.”
Last year for Hands On, I gutted a defunct TRS-80 Model 100. The goal was to upgrade its 24 kilobytes of RAM and 2.4-megahertz, 8-bit CPU but keep the notebook computer’s lovely keyboard and LCD screen. That article was almost entirely about figuring out how to drive its squirrely 1980s-era LCD screen. I left the rest, as they say, as an exercise for the reader. After all, sending a stream of data from a new CPU to the Arduino Mega controlling the screen would be a trivial exercise, right?
No, folks, no it was not. IEEE Spectrum’s Hands On articles provide necessarily linear versions of how projects come together. They can give the impression we’re terribly clever, which has about the same relationship to reality as an influencer’s curated social-media feed. So every now and then I like to present a tale steeped in failure, just as a reminder that this is what engineering’s like sometimes.
To send screen data to the Mega, I had a choice between several methods that are supported by CircuitPython’s display driver libraries. I wanted to use a CircuitPython-powered microcontroller as the Model 100’s new brain because there’s a lot of existing software I could port over. In particular, CircuitPython’s display libraries would greatly simplify creating graphics and text and would automatically update the display. My choices were between a parallel interface and two serial interfaces: SPI and I2C.
The parallel interface would require at least 12 wires. SPI was better, being a four-wire interface. But I2C was best of all, requiring only two wires! Additionally, there are many breakout boards that support I2C, including storage and sensors of all types. One I2C bus can, in theory, support over a hundred I2C peripherals. I2C is much slower than SPI, but the Model 100’s delightfully chunky 240-by-64-pixel display is slower still. And I’d used I2C-based peripherals many times before in previous projects. I2C was the obvious choice. But there’s a big difference between using a peripheral created by a vendor and building one yourself.
The Grand Central controller [bottom] provides the new brains of the Tandy. Although the controller has the same form factor as the Arduino Mega, it has vastly more compute power. A custom-built shield holds a supporting voltage-level shifter [top left] that converts the 3.3- and 5-volt logic levels used by the controllers appropriately. James Provost
On the circuit level, I2C is built around an “open drain” principle. When the bus is idle, or when a 1 is being transmitted, pull-up resistors hold the lines at the voltage level indicating a logical high. Connecting a line to ground pulls it low. One line transmits pulses from the central controller as a clock signal. The other line handles data, with one bit transmitted per clock cycle. Devices recognize when traffic on the bus is intended for them because each has a unique 7-bit address. This address is prepended to any block of data bytes being sent. In theory, any clock speed and logic-level voltage can be used, as long as both the controller and peripheral accept them.
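The addressing scheme is easy to sketch: every I2C transaction begins with one byte that packs the 7-bit address into the upper bits and a read/write flag into the least-significant bit. A minimal illustration of that standard framing (not specific to CircuitPython’s internals):

```python
def i2c_first_byte(address: int, read: bool) -> int:
    """Build the first byte of an I2C transaction: the 7-bit address
    shifted left one bit, with the R/W flag as the low bit (1 = read)."""
    if not 0 <= address <= 0x7F:
        raise ValueError("I2C addresses are 7-bit (0-127)")
    return (address << 1) | (1 if read else 0)

# Writing to a peripheral at address 0x22 puts 0x44 on the bus;
# reading from the same peripheral puts 0x45.
print(hex(i2c_first_byte(0x22, read=False)))  # 0x44
print(hex(i2c_first_byte(0x22, read=True)))   # 0x45
```

This is also why logic analyzers can decode a raw pulse train into addresses: the framing is the same for every device on the bus.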
And there was my first and, I thought, only problem: The microcontrollers that ran CircuitPython and were computationally hefty enough for my needs ran on 3.3 volts, while the Arduino Mega uses the 5 V required to drive the LCD. An easy solve though—I’d just use a US $4 off-the-shelf logical level shifter, albeit a special type that’s compatible with I2C’s open-drain setup.
Using a $40 Adafruit Grand Central board as my central controller, I connected it to the Mega via the level shifter, and put some test code on both microcontrollers. The most basic I2C transaction possible is for the controller to send a peripheral’s address over the bus and get an acknowledgement back.
No response. After checking my code and wiring, I hooked up a logic analyzer to the bus. Out popped a lovely pulse train, which the analyzer software decoded as a stream of correctly formed addresses being sent out by the Grand Central controller as it scanned for peripherals, but with no acknowledgement from the Mega.
I2C is a relatively low-speed bus that provides bidirectional communications between a controller and (in theory) over a hundred peripherals. A data and clock line are kept at a high voltage by pull-up resistors. The clock line’s frequency is set by the controller, while both the controller and peripheral devices can affect the data line by connecting it to ground. A peripheral will take control of the data line only after it has been commanded to do so by the controller, to avoid communication collisions. James Provost
I’ll skip over the next few hours of diagnostic failure, which involved much gnashing of teeth and a dead end: a quirk whereby the SAMD chip at the heart of the Grand Central controller (and many others) has a hardware-accelerated I2C interface that reportedly can’t go slower than a clock speed of 100 kilohertz. Eventually I hooked up the logic analyzer again, and scrolling up and down through the decoded pulses I finally noticed that the bus scan started not at address 0 or 1, but at 16. Now, when I’d picked an address for the Mega in my test code, I’d seen many descriptions of I2C in tutorials that said the possible range of addresses ran from 0 to 127. When I’d looked at what seemed like a pretty comprehensive description by Texas Instruments of how the I2C bus worked down to the electrical level, addresses were simply described as being 7-bit—that is, 0 to 127. So I’d picked 4, more or less at random.
But with the results of my logic scan in hand, I discovered that, oh, by the way, addresses 0 to 7 are actually unusable because they are reserved for various bus-management functions. So I went back to my original hardware setup, plugged in a nice two-digit address, and bingo! Everything worked just fine.
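With hindsight, the reserved ranges are simple to encode. Per the I2C specification, addresses 0x00–0x07 and 0x78–0x7F are set aside for bus management (general call, START byte, high-speed mode, 10-bit addressing, and so on), leaving 0x08–0x77 for ordinary peripherals. A small helper capturing that rule:

```python
def is_usable_i2c_address(address: int) -> bool:
    """True if a 7-bit I2C address is available to ordinary peripherals.
    The I2C spec reserves 0x00-0x07 and 0x78-0x7F for bus-management
    functions, leaving 0x08-0x77 (8-119) for devices."""
    return 0x08 <= address <= 0x77

print(is_usable_i2c_address(4))   # False -- my original, doomed pick
print(is_usable_i2c_address(16))  # True -- where the bus scan starts
```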
True, this headache was caused by my own lack of understanding of how I2C works. The caveat that reserved addresses exist can be found in some tutorials, as well as more detailed documentation from folks like Texas Instruments. But in my defense, even in the best tutorials it’s usually pretty buried and easy to miss. (The vast majority of I2C instruction concerns the vastly more common situation where a grown-up has built the peripheral and hardwired it with a sensible address.) And even then, nothing would have told me that CircuitPython’s heartbeat scan would start at 16.
Oh well, time to press on with the upgrade. The rest should be pretty easy, though!
Nvidia says it has found a way to speed up a computation-limited step in the chipmaking process so that it happens 40 times as fast as today’s standard. Called inverse lithography, it’s a key tool that allows chipmakers to print nanometer-scale features using light with a longer wavelength than the size of those features. Inverse lithography’s use has been limited by the massive size of the needed computation. Nvidia’s answer, cuLitho, a set of algorithms designed for use with GPUs, turns what has been two weeks of work into an overnight job.
The technology “will allow fabs to increase throughput, reduce their carbon footprint, and set the foundations for 2-nanometer [manufacturing processes] and beyond,” said Nvidia CEO Jensen Huang at the Nvidia GTC developer conference on Tuesday.
Leading logic-chip foundry Taiwan Semiconductor Manufacturing Co. (TSMC) will be qualifying cuLitho’s use in production starting in June, said Huang. Design automation software firm Synopsys plans to integrate software, too, and lithography equipment maker ASML plans to support cuLitho in its products as well.
Photolithography is basically the first step in the chipmaking process. It involves bouncing light off of a pattern called a photomask to project the forms of transistor structures and interconnects onto the wafer. (More mature technology uses transmissive photomasks instead of reflective ones, but the idea is the same.) It takes 98 photomasks to make an H100 GPU, Huang said. The features projected from the photomask are smaller than the wavelength of light used—193 nanometers for the relatively large features and 13.5 nm for the finer bits. So, without the aid of tricks and design rules—collectively called optical proximity correction—you’d get only a blurry mess projected onto the wafer. But with optical proximity correction, the designs on the photomask only vaguely resemble the pattern of light on the chip.
With the need for finer and finer features, the corrected shapes on the photomask have become more and more elaborate and difficult to come up with. It would be much better to start with the pattern you want on the wafer and then calculate what pattern on the photomask would produce them. Such a scheme is called inverse lithography. Simple as it sounds, it’s quite difficult to compute, often taking weeks to compile.
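The inverse problem can be illustrated with a toy one-dimensional model: treat projection as a known blur of the mask, then search for the mask whose blurred image best matches the desired wafer pattern. This is a pure-Python sketch of the idea only, not Nvidia’s cuLitho algorithm; the blur kernel and the gradient-descent optimizer are assumptions for illustration:

```python
# Toy 1-D inverse lithography: the projected image is a blurred
# version of the mask, so we search for the mask whose blur best
# matches the target pattern on the "wafer."
def blur(mask):
    """Forward model: 3-tap optical blur with reflecting edges."""
    n = len(mask)
    out = []
    for i in range(n):
        left = mask[max(i - 1, 0)]
        right = mask[min(i + 1, n - 1)]
        out.append(0.25 * left + 0.5 * mask[i] + 0.25 * right)
    return out

def invert(target, steps=500, lr=0.5):
    """Gradient descent on the squared error between blur(mask) and target."""
    mask = list(target)  # start from the target pattern itself
    for _ in range(steps):
        img = blur(mask)
        err = [img[i] - target[i] for i in range(len(target))]
        # The symmetric kernel with reflected edges makes blur() its own
        # transpose, so the exact gradient of sum(err^2) is blur(2*err).
        grad = blur([2 * e for e in err])
        mask = [m - lr * g for m, g in zip(mask, grad)]
    return mask

target = [0, 0, 1, 1, 1, 0, 0]
mask = invert(target)
initial_err = sum((b - t) ** 2 for b, t in zip(blur(target), target))
final_err = sum((b - t) ** 2 for b, t in zip(blur(mask), target))
print(final_err < initial_err)  # the optimized mask reproduces the target better
```

The real problem is brutally harder (2-D, nonlinear resist models, billions of pixels), which is why it historically took weeks of CPU time.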
In fact, it’s such a slog that it’s often reserved for use on only a few critical layers of leading-edge chips or just particularly thorny bits of them, according to data from the E-Beam Initiative, which periodically surveys the industry.
As chipmaking required finer and finer features, engineers had to produce more and more complex designs to project those features onto the silicon. Inverse lithography (ILT) is the latest development. Nvidia
The long computation time for lithography slows the development and improvement of chip technology. Even a change to the thickness of a material can lead to the need for a new set of photomasks, notes Vivek K. Singh, vice president in the advanced technology group working on silicon manufacturing at Nvidia. Computing masks “has been a long pole in chip development,” he says. “If inverse lithography technology were sped up 40 times, would more companies use it on more layers? Surely.”
Part of the computation is an image problem that’s a natural fit for GPUs, says Singh. But at most, that can only cut the computation time in half. The rest is not so easy to make parallel. But over the past four years, with development partners including TSMC, Nvidia engineers have come up with a collection of algorithms for making the remaining work parallel and have packaged it as a software library for use with GPUs.
According to Nvidia, cuLitho lets 500 Nvidia DGX H100 computers do the work of 40,000 CPU systems. It can produce three to five times as many photomasks per day, drawing only 5 megawatts instead of 35 MW.
What’s more, the technology may deliver better results, according to Singh. CuLitho produces otherwise hard-to-calculate curvy polygons on the mask, which results in a greater depth of focus for the pattern cast onto the wafer. That depth of focus should lead to less variation across the wafer and therefore a greater yield of working chips per wafer, he says. In future, it also could mean fewer photomasks are needed; with inverse lithography, what must now be done with a double pattern might work with only one.
Nvidia is not the first to look to GPUs to accelerate inverse lithography technology. Silicon Valley-based D2S announced a GPU-based computer custom built for the problem in 2019. IEEE Spectrum reached out to D2S for comment, but the company did not reply before press time.
In mid-2021, the term “Web3” suddenly exploded into the public consciousness. As people scrambled to figure out what it was—cryptocurrencies? blockchain? nonfungible tokens?—venture capital firms were pouring money into new startups, over US $30 billion before the year was out.
Meanwhile, Molly White, a software engineer, started reading up on the tech in case that was the direction her career would be heading in. But she found herself taking a different direction: She launched the website Web3 Is Going Just Great, with the aim of tracking the scams and fraud in the cryptocurrency world. So far, she’s tallied $11.8 billion in money lost on the website’s Grift Counter. White answered five rapid-fire questions on the Web3 phenomenon and why she’s still not impressed.
How did you end up running a site like Web3 Is Going Just Great?
Molly White: When I started researching the topic, I was just seeing a lack of reporting on some of the downsides—you know, the hacks, the scams, the fraud. And so I decided I could do my part to try and fill that void to some extent, because I feel like it’s important that people get the full picture.
A lot of the projects you’re tracking involve cryptocurrency and blockchain technologies. Is that what “Web3” means? Are all of these terms synonymous?
White: It’s primarily a marketing term. And I think the industry benefits from how nebulous it is because it can mean whatever is most useful at that time. But broadly speaking, Web3 refers to blockchains underpinning everything you do online.
The crypto industry seemed like it might collapse when the cryptocurrency exchange FTX went bankrupt in November 2022, but you’re still updating the site with new projects. Is the industry still just trucking along, or has it changed after that event?
White: I think that FTX and the related collapses have been a really big hit to the crypto “brand,” but I think that the crypto industry is constantly working on finding the next big thing that they can sell retail investors on. And so that is very much underway at this point.
You can sort of see what’s happening as people start distancing themselves from FTX and saying that the FTX collapse wasn’t a flaw of crypto—it was a flaw of centralization or fraudulent actors. So I get the sense that people are going to be moving toward selling people on more decentralized finance products. That’s my guess of what the next big thing is going to be. It’s either that or crypto meets AI. We’ll see.
Have you ever come across a project that made you think, “Oh, maybe there’s a worthwhile reason for adding a blockchain to this”? Or are you still waiting for that project?
White: I’m mostly still waiting. Every once in a while there’s something where I can understand what they’re going for, but I don’t understand why they’ve picked a blockchain over a more efficient or less expensive solution. And sometimes there’s individual cases where people have benefited from crypto, but I don’t necessarily see that as scalable, or a strong argument for the technology itself.
Do you think you’re more skeptical about crypto and Web3 than when you started Web3 Is Going Just Great?
White: Well, I still have an open mind. I still tell people that I’m open to there being some killer use case that I just haven’t thought about. But seeing the constant fraud and how motivated people are by the economic forces in crypto to take advantage of people has really made me very skeptical and cynical about the industry.
The Ericsson Technology Review is now available in the IEEE Xplore Digital Library. The monthly magazine provides insights on emerging innovations that are shaping the future of information and communication technology.
The publication, which dates back to 1924, is published by Ericsson, a multinational networking and telecommunications company based in Stockholm.
An IEEE Xplore subscription is not required to access the freely available research papers.
“IEEE is a respected organization, and Ericsson has the ambition to reach even further into the academic community and research institutes with our cutting-edge research and development,” says Erik Ekudden, the company’s chief technology officer. “We believe that IEEE Xplore is a good channel for this target group.”
The Review in IEEE Xplore includes newly published articles plus those from the magazine’s archives going back to 2020, according to Naveen Maddali, senior product manager of content partnerships for IEEE Global Products and Marketing. There are now more than 80 articles in the digital library. Topics include computing, robotics, and signal processing.
“The Ericsson Technology Review is a valuable publication for anyone using IEEE Xplore,” Maddali says. “There’s a lot of useful content on telecommunications and communications for all types of the digital library’s users.”
Maddali says the project was volunteer-driven. The effort was supported by Ericsson’s CTO office following an initiative by IEEE Senior Member Glenn Parsons, principal standards advisor with Ericsson Canada. He was a member of the IEEE Publication Services and Products Board and the IEEE Technical Activities Board/PSPB Products and Services Committee that developed the third-party content hosting process. Parsons suggested that Ericsson Technology Review be used to do a trial run of the new hosting process.
The journal’s articles, written by Ericsson’s researchers, cover topics including communication, networking, and broadcast technologies; computing and processing; power and energy applications; robotics and control systems; and signal processing and analysis.
Ekudden adds that with the new partnership, “Ericsson hopes to increase the understanding of important technology trends. Mobile technology, 5G, and its included technology capabilities are a vital base for the ongoing digital transformation of enterprises and society.”
IEEE Xplore contains publications from other large companies in addition to Ericsson, including the IBM Journal of Research and Development and the Bell Labs Technical Journal. Hosting the publications in IEEE Xplore aligns with IEEE’s goal of providing practical content from leading organizations to those in industry, Maddali says.
Metal detecting can be a fun hobby, or it can be a task to be completed in deadly earnest—if the buried treasure you’re searching for includes land mines and explosive remnants of war. This is an enormous, dangerous problem: Something like 12,000 square kilometers worldwide are essentially useless and uninhabitable because of the threat of buried explosives, and thousands and thousands of people are injured or killed every year.
While there are many different ways of detecting mines and explosives, none of them are particularly quick or easy. For obvious reasons, sending a human out into a minefield with a metal detector is not the safest way of doing things. So, instead, people send anything else that they possibly can, from machines that can smash through minefields with brute force to well-trained rats that take a more passive approach by sniffing out explosive chemicals.
Because the majority of mines are triggered by pressure or direct proximity, it may seem that a drone would be the ideal way to detect them nonexplosively. However, unless you’re only detecting over a perfectly flat surface (and perhaps not even then) your detector won’t be positioned ideally most of the time, and you might miss something, which is not a viable option for mine detection.
But now a novel combination of a metal detector and a drone with 5 degrees of freedom is under development at the Autonomous Systems Lab at ETH Zurich. It may provide a viable solution to remote land-mine detection, by using careful sensing and localization along with some twisting motors to keep the detector reliably close to the ground.
The really tricky part of this whole thing is making sure that the metal detector stays at the correct orientation relative to the ground surface so there’s no dip in its effectiveness. With a conventional drone, this wouldn’t work at all, because every time the drone moves in any direction but up or down, it has to tilt, which is going to also tilt anything that’s attached to it. Unless you want to mount your metal detector on some kind of (likely complicated and heavy) gimbal system, you need a drone that can translate its position without tilting. Happily, such a drone not only exists but is commercially available.
The drone used in this research is made by a company called Voliro, and it’s a tricopter that uses rotating thruster nacelles that move independently of the body of the drone. It may not shock you to learn that Voliro (which has, in the past, made some really weird flying robots) is a startup with its roots in the Autonomous Systems Lab at ETH Zurich, the same place where the mine-detecting drone research is taking place.
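The geometry behind that decoupling is simple to sketch: to accelerate sideways while the body (and an attached detector) stays level, the rotor nacelles themselves must tilt so the net thrust vector gains a horizontal component. Here's a minimal illustration in Python; this is not Voliro's actual control law, and the function name and simplifications are mine:

```python
import math

def nacelle_tilt(ax: float, az: float, g: float = 9.81) -> tuple[float, float]:
    """Given desired horizontal (ax) and vertical (az) accelerations in m/s^2,
    return (tilt angle in radians, required specific thrust in m/s^2)
    for a level-bodied vehicle with tilting rotors."""
    fz = g + az            # thrust must cancel gravity plus provide az
    fx = ax                # horizontal component comes purely from nacelle tilt
    tilt = math.atan2(fx, fz)
    thrust = math.hypot(fx, fz)
    return tilt, thrust

# Hover: no tilt, thrust exactly cancels gravity.
tilt, thrust = nacelle_tilt(0.0, 0.0)
# Accelerate sideways at 2 m/s^2 while holding altitude:
tilt2, thrust2 = nacelle_tilt(2.0, 0.0)
```

A conventional quadrotor has no choice but to tilt its whole body to get that horizontal thrust component; a tilting-nacelle design can leave the body, and the detector, level.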
So, now that you have a drone that’s theoretically capable of making your metal detector work, you need to design the control system that makes it work in practice. The system needs to be able to pilot the drone across a 3D surface that it has never seen before and that might include obstacles, all while prioritizing the alignment of the detector. The researchers fuse GPS, inertial measurements, and data from a lidar mounted on the drone for absolute position and state estimation; the system then autonomously plots and executes a “boustrophedon coverage path” across an area of interest. A boustrophedon—not a word I knew existed until just this minute—refers to something (usually writing) in which alternate lines are reversed (and mirrored). So, right to left, and then left to right.
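Generating such a sweep is straightforward, at least over idealized terrain. Here's a minimal Python sketch of a boustrophedon coverage path over a flat rectangle; the grid parameters are hypothetical, and the real ETH Zurich planner also has to handle slopes and obstacles:

```python
def boustrophedon_path(width: float, height: float, lane_spacing: float):
    """Generate back-and-forth waypoints covering a width x height rectangle,
    alternating sweep direction on each lane like lines of boustrophedon text."""
    waypoints = []
    y = 0.0
    left_to_right = True
    while y <= height:
        if left_to_right:
            waypoints.append((0.0, y))
            waypoints.append((width, y))
        else:
            waypoints.append((width, y))
            waypoints.append((0.0, y))
        left_to_right = not left_to_right
        y += lane_spacing
    return waypoints

# Three lanes at y = 0, 2, 4; direction alternates each lane.
path = boustrophedon_path(10.0, 4.0, 2.0)
```

The lane spacing would be chosen from the detector's effective sensing width, so adjacent sweeps overlap enough that nothing slips between lanes.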
Testing with metallic (nonexplosive) targets showed that this system does very well, even in areas with obstacles, overhead occlusion, and significant slope. Whether it’s ultimately field-useful or not will require some further investigation, but because the platform itself is commercial, off-the-shelf hardware, there’s a bit more room for optimism than there otherwise might be.
A research paper, “Resilient Terrain Navigation with a 5 DOF Metal Detector Drone” by Patrick Pfreundschuh, Rik Bähnemann, Tim Kazik, Thomas Mantel, Roland Siegwart, and Olov Andersson from the Autonomous Systems Lab at ETH Zurich, will be presented in May at ICRA 2023 in London.
Dead, and in a jacket and tie. That’s how he was on 1 December 1948, when two men found him slumped against a retaining wall on the beach at Somerton, a suburb of Adelaide, Australia.
The Somerton Man’s body was found on a beach in 1948. Nobody came forward to identify him. JAMES DURHAM
Police distributed a photograph, but no one came forward to claim the body. Eyewitnesses reported having seen the man, whom the newspapers dubbed the Somerton Man and who appeared to be in his early 40s, lying on the beach earlier, perhaps at one point moving his arm, and they had concluded that he was drunk. The place of death led the police to treat the case as a suicide, despite the apparent lack of a suicide note. The presence of blood in the stomach, a common consequence of poisoning, was noted at the autopsy. Several chemical assays failed to identify any poison; granted, the methods of the day were not up to the task.
There was speculation of foul play. Perhaps the man was a spy who had come in from the cold; 1948 was the year after the Cold War got its name. This line of thought was strengthened, a few months later, by codelike writings in a book that came to be associated with the case.
These speculations aside, the idea that a person could simply die in plain view and without friends or family was shocking. This was a man with an athletic build, wearing a nice suit, and showing no signs of having suffered violence. The problem nagged many people over the years, and eventually it took hold of me. In the late 2000s, I began working on the Somerton Man mystery, devoting perhaps 10 hours a week to the research over the course of about 15 years.
Throughout my career, I have always been interested in cracking mysteries. My students and I used computational linguistics to identify which of the three authors of The Federalist Papers—Alexander Hamilton, James Madison, and John Jay—was responsible for any given essay. We tried using the same method to confirm authorship of Biblical passages. More recently, we’ve been throwing some natural-language processing techniques into an effort to decode the Voynich Manuscript, an early 15th-century document written in an unknown language and an unknown script. These other projects yielded to one or another key method of inquiry. The Somerton Man problem posed a broader challenge.
My one great advantage has been my access to students and to scientific instruments at the University of Adelaide, where I am a professor of electrical and electronic engineering. In 2009, I established a working group at the university’s Center for Biomedical Engineering.
One question surrounding the Somerton Man had already been solved by sleuths of a more literary bent. In 1949, a pathologist had found a bit of paper concealed in one of the dead man’s pockets, and on it were printed the words Tamám Shud, the Persian for “finished.” The phrase appears at the end of Edward FitzGerald’s translation of the Rubáiyát of Omar Khayyám, a poem that remains popular to this day.
The police asked the public for copies of the book in which the final page had been torn out. A man found such a book in his car, where apparently it had been thrown in through an open window. The book proved a match.
The back cover of the book also included scribbled letters, which were at first thought to constitute an encrypted message. But statistical tests carried out by my team showed that it was more likely a string of the initial letters of words. Through computational techniques, we eliminated all of the cryptographic codes known in the 1940s, leaving as a remaining possibility a one-time pad, in which each letter is based on a secret source text. We ransacked the poem itself and other texts, including the Bible and the Talmud, but we never identified a plausible source text. It could have been a pedestrian aide-mémoire—to list the names of horses in an upcoming race, for example. Moreover, our research indicates that it doesn’t have the structural sophistication of a code. The Persian phrase could have been the man’s farewell to the world: his suicide note.
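The flavor of that statistical test can be sketched in a few lines: score the letter string against the frequency distribution of letters in ordinary English text and against the distribution of word-initial letters, and see which model fits better. The frequency tables below are rough illustrative values, not the corpus-derived tables our team actually used:

```python
import math

# Rough, illustrative frequencies (percent); a real analysis uses full
# corpus-derived tables for all 26 letters.
GENERAL = {"e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7,
           "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8}
INITIAL = {"t": 16.0, "a": 11.7, "s": 7.8, "o": 7.6, "i": 7.3, "w": 5.5,
           "c": 5.2, "b": 4.4, "p": 4.3, "h": 4.2, "f": 4.0, "m": 3.8}

def log_likelihood(text: str, freqs: dict[str, float], floor: float = 0.5) -> float:
    """Sum of log-probabilities of each letter under a frequency table;
    letters missing from the table get a small floor probability."""
    total = 0.0
    for ch in text.lower():
        if ch.isalpha():
            total += math.log(freqs.get(ch, floor) / 100.0)
    return total

def more_like_initials(text: str) -> bool:
    """True if the string fits the initial-letter model better than
    the general-text model."""
    return log_likelihood(text, INITIAL) > log_likelihood(text, GENERAL)
```

A string of initials is noticeably heavier in letters like T and W, and lighter in E, than running English text, which is what such a test picks up on.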
Also scribbled on the back cover was a telephone number that led to one Jo Thomson, a woman who lived merely a five-minute walk from where the Somerton Man had been found. Interviewers then and decades later reported that she had seemed evasive; after her death, some of her relatives and friends said they speculated that she must have known the dead man. I discovered a possible clue: Thomson’s son was missing his lateral incisors, the two teeth that normally flank the central incisors. This condition, found in a very small percentage of the population, is often congenital; oddly, the Somerton Man had it, too. Were they related?
And yet the attempt to link Thomson to the body petered out. Early in the investigation, she told the police that she had given a copy of the Rubáiyát to a lieutenant in the Australian Army whom she had known during the war, and indeed, that man turned out to own a copy. But Thomson hadn’t seen him since 1945, he was very much alive, and the last page of his copy was still intact. A trail to nowhere, one of many that were to follow.
We engineers in the 21st century had several other items to examine. First was a plaster death mask that had been made six months after the man died, during which time the face had flattened. We tried several methods to reconstruct its original appearance: In 2013 we commissioned a picture by Greg O’Leary, a professional portrait artist. Then, in 2020, we approached Daniel Voshart, who designs graphics for Star Trek movies. He used a suite of professional AI tools to create a lifelike reconstruction of the Somerton Man. Later, we obtained another reconstruction by Michael Streed, a U.S. police sketch artist. We published these images, together with many isolated facts about the body, the teeth, and the clothing, in the hope of garnering insights from the public. No luck.
As the death mask had been molded directly off the Somerton Man’s head, neck, and upper body, some of the man’s hair was embedded in the plaster of Paris—a potential DNA gold mine. At the University of Adelaide, I had the assistance of a hair forensics expert, Janette Edson. In 2012, with the permission of the police, Janette used a magnifying glass to find where several hairs came together in a cluster. She was then able to pull out single strands without breaking them or damaging the plaster matrix. She thus secured the soft, spongy hair roots as well as several lengths of hair shaft. The received wisdom of forensic science at the time held that the hair shaft would be useless for DNA analysis without the hair root.
Janette performed our first DNA analysis in 2015 and, from the hair root, was able to place the sample within a maternal genetic lineage, or haplotype, known as “H,” which is widely spread around Europe. (Such maternally inherited DNA comes not from the nucleus of a cell but from the mitochondria.) The test therefore told us little we hadn’t already known. The concentration of DNA was far too low for the technology of the time to piece together the sequencing we needed.
Fortunately, sequencing tools continued to improve. In 2018, Guanchen Li and Jeremy Austin, also at the University of Adelaide, obtained the entire mitochondrial genome from hair-root material and narrowed down the maternal haplotype to H4a1a1a.
However, to identify the Somerton Man using DNA databases, we needed to go to autosomal DNA—the kind that is inherited from both parents. There are more than 20 such databases, 23andMe and Ancestry being the largest. These databases require sequences of 500,000 to 2,000,000 single nucleotide polymorphisms, or SNPs (pronounced “snips”). The concentration levels of autosomes in the human cell tend to be much lower than those of the mitochondria, and so Li and Austin were able to obtain only 50,000 SNPs, of which 16,000 were usable. This was a breakthrough, but it still wasn’t good enough to work on a database.
In 2022, at the suggestion of Colleen Fitzpatrick, a former NASA employee who had trained as a nuclear physicist but then became a forensic genetics expert, I sent a hair sample to Astrea Forensics, a DNA lab in the United States. This was our best hair-root sample, one that I had nervously guarded for 10 years. The result from Astrea came back—and it was a big flop.
Seemingly out of options, we tried a desperate move. We asked Astrea to analyze a 5-centimeter-long shaft of hair that had no root at all. Bang! The company retrieved 2 million SNPs. The identity of the Somerton Man was now within our reach.
So why did the rootless shaft work in our case?
The DNA analysis that police use for standard crime-solving relies on only 20 to 25 short tandem repeats (STRs) of DNA. That’s fine for police, who mostly do one-to-one matches to determine whether the DNA recovered at a crime scene matches a suspect’s DNA.
But finding distant cousins of the Somerton Man on genealogical databases constitutes a one-to-many search, and for that you typically need around 500,000 markers. For these genealogical searches, SNPs are used because they contain information on ethnicity and ancestry generally. Note that the DNA fragments carrying SNPs can be as short as 50 to 150 base pairs, whereas typical STRs tend to be longer, containing 80 to 450 base pairs. The hair shaft contains DNA that is mostly fragmented, so it’s of little use when you’re seeking longer STR segments, but it’s a great source of SNPs. This is why crime forensics traditionally focused on the root and ignored the shaft, although this practice is now changing very slowly.
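A toy calculation shows why fragment length matters. With hypothetical fragment lengths typical of degraded DNA, plenty of fragments are long enough to carry a SNP-bearing read, but almost none span a long STR:

```python
def usable_fragments(fragment_lengths, min_len):
    """Count DNA fragments at least min_len base pairs long."""
    return sum(1 for n in fragment_lengths if n >= min_len)

# Hypothetical fragment lengths (in base pairs) from degraded hair-shaft DNA:
fragments = [62, 45, 110, 88, 37, 140, 95, 52, 71, 120]

snp_ok = usable_fragments(fragments, 50)    # short reads still carry SNPs
str_ok = usable_fragments(fragments, 300)   # few fragments span a long STR
```

With these illustrative numbers, 8 of the 10 fragments are usable for SNP typing and none can span a 300-base-pair STR, which is the hair shaft's predicament in miniature.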
Another reason the shaft was such a trove of DNA is that keratin, its principal component, is a very tough protein, and it had protected the DNA fragments lodged within it. The 74-year-old soft spongy hair root, on the other hand, had not protected the DNA to the same extent. We set a world record for obtaining a human identification, using forensic genealogy, from the oldest piece of hair shaft. Several police departments in the United States now use hair shafts to retrieve DNA, as I am sure many will start to do in other countries, following our example.
Libraries of SNPs can be used to untangle the branching lines of descent in a family tree. We uploaded our 2 million SNPs to GEDmatch Pro, an online genealogical database located in Lake Worth, Fla. (and recently acquired by Qiagen, a biotech company based in the Netherlands). The closest match was a rather distant relative based in Victoria, Australia. Together with Colleen Fitzpatrick, I built out a family tree containing more than 4,000 people. On that tree we found a Charles Webb, son of a baker, born in 1905 in Melbourne, with no date of death recorded.
Charles never had children of his own, but he had five siblings, and I was able to locate some of their living descendants. Their DNA was a dead match. I also found a descendant of one of his maternal aunts, who agreed to undergo a test. When a positive result came through on 22 July 2022, we had all the evidence we needed. This was our champagne moment.
In late 2021, police in South Australia ordered an exhumation of the Somerton Man’s body for a thorough analysis of its DNA. At the time we prepared this article, they had not yet confirmed our result, but they did announce that they were “cautiously optimistic” about it.
All at once, we were able to fill in a lot of blank spaces. Webb was born on 16 November 1905, in Footscray, a suburb of Melbourne, and educated at a technical college, now Swinburne University of Technology. He later worked as an electrical technician at a factory that made electric hand drills. Our DNA tests confirmed he was not related to Thomson’s son, despite the coincidence of their missing lateral incisors.
We discovered that Webb had married a woman named Dorothy Robertson in 1941 and had separated from her in 1947. She filed for divorce on grounds of desertion, and the divorce lawyers visited his former place of work, confirming that he had quit around 1947 or 1948. But they could not determine what happened to him after that. The divorce finally came through in 1952; in those days, divorces in Australia were granted only five years after separation.
At the time of Webb’s death his family had become quite fragmented. His parents were dead, a brother and a nephew had died in the war, and his eldest brother was ill. One of his sisters died in 1955 and left him money in her will, mistakenly thinking he was still alive and living in another state. The lawyers administering the will were unable to locate Charles.
We got more than DNA from the hair: We also vaporized a strand of hair by scanning a laser along its length, a technique known as laser ablation. By performing mass spectrometry on the vapor, we were able to track Webb’s varying exposure to lead. A month before Webb’s death, his lead level was high, perhaps because he had been working with the metal, maybe soldering with it. Over the next month’s worth of hair growth, the lead concentration declined; it reached its lowest level at his death. This might be a sign that he had moved.
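The mapping from position along the shaft to time is simple arithmetic. Assuming a constant growth rate of roughly 1 centimeter per month (a textbook average, not a rate measured for Webb specifically), a reading's distance from the root gives its approximate age:

```python
def days_before_death(distance_from_root_cm: float,
                      growth_cm_per_month: float = 1.0) -> float:
    """Map a position along a hair shaft to approximate days before death,
    assuming constant growth and measuring from the root (the newest hair)."""
    return 30.0 * distance_from_root_cm / growth_cm_per_month

# A reading 1 cm from the root formed roughly a month before death;
# the far end of a 5-cm shaft is several months old.
t = days_before_death(1.0)
```

Scanning the laser from root to tip thus reads out a month-by-month chemical diary of the subject's final period of life.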
With a trove of photographs from family albums and other sources, we were able to compare the face of the young Webb with the artists’ reconstructions we had commissioned in 2013 and 2021 and the AI reconstruction we had commissioned in 2020. Interestingly, the AI reconstruction had best captured his likeness.
A group photograph, taken in 1921, of the Swinburne College football team, included a young Webb. Clues found in newspapers show that he continued to participate in various sports, which would explain the athletic condition of his body.
What’s interesting about solving such a case is how it relies on concepts that may seem counterintuitive to forensic biologists but are quite straightforward to an electronics engineer. For example, when dealing with a standard crime scene that uses only two dozen STR markers, one observes very strict protocols to ensure the integrity of the full set of STRs. When dealing with a case with 2 million SNPs, by contrast, things are more relaxed. Many of the old-school STR protocols don’t apply when you have access to a lot of information. Many SNPs can drop out, some can even be “noise,” the signal may not be clean—and yet you can still crack the case!
Engineers understand this concept well. It’s what we call graceful degradation—when, say, a few flipped bits on a digital video signal are hardly noticed. The same is true for a large SNP file.
And so, when Astrea retrieved the 2 million SNPs, the company didn’t rely on the traditional framework for DNA-sequencing reads. It used a completely different mathematical framework, called imputation. The concept of imputation is not yet fully appreciated by forensics experts who have a biological background. However, for an electronics engineer, the concept is similar to error correction: We infer and “impute” bits of information that have dropped out of a received digital signal. Such an approach is not possible with a few STRs, but when handling over a million SNPs, it’s a different ball game.
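The spirit of imputation can be shown with a toy example: match the observed SNP positions against a panel of known reference haplotypes, then copy the best match's alleles into the gaps. Real imputation tools use statistical models over enormous reference panels; this sketch of mine only illustrates the principle:

```python
def impute(observed: dict[int, str], reference_panel: list[str]) -> str:
    """Fill in missing SNP calls by copying from the reference haplotype
    that best agrees with the observed positions. The redundancy in the
    panel lets us infer values that dropped out, as in error correction."""
    def matches(ref: str) -> int:
        return sum(1 for pos, allele in observed.items() if ref[pos] == allele)
    best = max(reference_panel, key=matches)
    return "".join(observed.get(i, best[i]) for i in range(len(best)))

# Toy panel of known haplotypes over 8 SNP positions:
panel = ["AACGTTGA", "AACGTCGA", "TTCGATGA"]
# Degraded sample: only 3 of the 8 positions were successfully read.
sample = {0: "A", 3: "G", 7: "A"}
completed = impute(sample, panel)
```

Three observed markers are nowhere near enough in reality, but the idea scales: the more markers you observe, the more confidently the missing ones can be inferred from population structure.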
Much of the work on identifying Charles Webb from his genealogy had to be done manually because there are simply no automated tools for the task. As an electronics engineer, I now see possible ways to make tools that would speed up the process. One such tool my team has been working on, together with Colleen Fitzpatrick, is software that can input an entire family tree and represent all of the birth locations as colored dots on Google Earth. This helps to visualize geolocation when dealing with a large and complex family.
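The output side of such a tool is simple, because Google Earth reads KML: each birthplace becomes a placemark. A minimal sketch, where the names and coordinates are illustrative rather than data from our actual tree:

```python
def birthplaces_to_kml(people: list[tuple[str, float, float]]) -> str:
    """Emit a minimal KML document with one placemark per person,
    suitable for loading into Google Earth."""
    placemarks = "".join(
        f"<Placemark><name>{name}</name>"
        f"<Point><coordinates>{lon},{lat}</coordinates></Point></Placemark>"
        for name, lat, lon in people
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2">'
            f"<Document>{placemarks}</Document></kml>")

# Illustrative entries (name, latitude, longitude):
tree = [("Charles Webb", -37.7995, 144.8997),      # Footscray, Melbourne
        ("Hypothetical Cousin", -37.8136, 144.9631)]
kml = birthplaces_to_kml(tree)
```

Seeing 4,000 birthplaces clustered on the globe makes migration patterns within a family jump out in a way a textual tree never does.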
The Somerton Man case still has its mysteries. We cannot yet determine where Webb lived in his final weeks or what he was doing. Although the literary clue he left in his pocket was probably an elliptical suicide note, we cannot confirm the exact cause of death. There is still room for research; there is much we do not know.
This article appears in the April 2023 print issue as “Finding Somerton Man.”
This morning at the ProMat conference in Chicago, Agility Robotics is introducing the latest iteration of Digit, its bipedal multipurpose robot designed for near-term commercial success in warehouse and logistics operations. This version of Digit adds a head (for human-robot interaction) along with manipulators intended for the very first task that Digit will be performing, one that Agility hopes will be its entry point to a sustainable and profitable business bringing bipedal robots into the workplace.
So that’s a bit of background, and if you want more, you should absolutely read the article that Agility CTO and cofounder Jonathan Hurst wrote for us in 2019 talking about the origins of this bipedal (not humanoid, mind you) robot. And now that you’ve finished reading that, here’s a better look at the newest, fanciest version of Digit:
The most visually apparent change here is of course Digit’s head, which either makes the robot look much more normal or a little strange depending on how much success you’ve had imagining the neck-mounted lidar on the previous version as a head. The design of Digit’s head is carefully done—Digit is (again) a biped rather than a humanoid, in the sense that the head is not really intended to evoke a humanlike head, which is why it’s decidedly sideways in a way that human heads generally aren’t. But at the same time, the purpose of the head is to provide a human-robot interaction (HRI) focal point so that humans can naturally understand what Digit is doing. There’s still work to be done here; we’re told that this isn’t the final version, but it’s at the point where Agility can start working with customers to figure out what Digit needs to be using its head for in practice.
Digit’s hands are designed primarily for moving totes. Agility
Digit’s new hands are designed to do one thing: move totes, which are the plastic bins that control the flow of goods in a warehouse. They’re not especially humanlike, and they’re not fancy, but they’re exactly what Digit needs to do the job that it needs to do. This is that job:
Yup, that’s it: moving totes from some shelves to a conveyor belt (and eventually, putting totes back on those shelves). It’s not fancy or complicated and for a human, it’s mind-numbingly simple. It’s basically an automated process, except in a lot of warehouses, humans are doing the work that robots like Digit could be doing instead. Or, in many cases, humans aren’t doing this work, because nobody actually wants these jobs and companies are having a lot of trouble filling these positions anyway.
For a robot, a task like this is not easy at all, especially when you throw legs into the mix. But you can see why the legs are necessary: they give Digit the same workspace as a human within approximately the same footprint as a human, which is a requirement if the goal is to take over from humans without requiring time-consuming and costly infrastructure changes. This gives Digit a lot of potential, as Agility points out in today’s press release:
Digit is multipurpose, so it can execute a variety of tasks and adapt to many different workflows; a fleet of Digits will be able to switch between applications depending on current warehouse needs and seasonal shifts. Because Digit is also human-centric, meaning it is the size and shape of a human and is built to work in spaces designed for people, it is easy to deploy into existing warehouse operations and as-built infrastructure without costly retrofitting.
We should point out that while Digit is multipurpose in the sense that it can execute a variety of tasks, at the moment, it’s just doing this one thing. And while this one thing certainly has value, the application is not yet ready for deployment, since there’s a big gap between being able to do a task most of the time (which is where Digit is now) and being able to do a task robustly enough that someone will pay you for it (which is where Digit needs to get to). Agility has some real work to do, but the company is already launching a partner program for Digit’s first commercial customers. And that’s the other thing that has to happen here: At some point Agility has to make a whole bunch of robots, which is a huge challenge by itself. Rather than building a couple of robots at a time for friendly academics, Agility will need to build and deliver and support tens and eventually hundreds or thousands or billions of Digit units. No problem!

Turning a robot from a research project into a platform that can make money by doing useful work has never been easy. And doing this with a robot that’s bipedal and is trying to do the same tasks as human workers has never been done before. It’s increasingly obvious that someone will make it happen at some point, but it’s hard to tell exactly when—if it’s anything like autonomous cars, it’s going to take way, way longer than it seems like it should. But with its partner program and a commitment to start manufacturing robots at scale soon, Agility is imposing an aggressive timeline on itself, with a plan to ship robots to its partners in early 2024, followed by general availability the following year.
Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.
Enjoy today’s videos!
Inspired by the hardiness of bumblebees, MIT researchers have developed repair techniques that enable a bug-sized aerial robot to sustain severe damage to the actuators, or artificial muscles, that power its wings—but to still fly effectively.
[ MIT ]
This robot gripper is called DragonClaw, and do you really need to know anything else?
“Alas, DragonClaw wins again!”
[ AMTL ]
Here’s a good argument for having legs on a robot:
And here’s a less-good argument for having legs on a robot. But it’s still impressive!
[ ANYbotics ]
Always nice to see drones getting real work done! Also, when you offer your drone up for power-line inspections and promise that it won’t crash into anything, that’s confidence.
[ Skydio ]
Voxel robots have been extensively simulated because they’re easy to simulate, but not extensively built because they’re hard to build. But here are some that actually work.
[ Paper ]
Reinforcement learning (RL) has become a promising approach to developing controllers for quadrupedal robots. We explore an alternative to the position-based RL paradigm, by introducing a torque-based RL framework, where an RL policy directly predicts joint torques at a high frequency, thus circumventing the use of a PD controller. The proposed learning torque control framework is validated with extensive experiments, in which a quadruped is capable of traversing various terrain and resisting external disturbances while following user-specified commands.
[ Berkeley ]
In this work we show how bioinspired, 3D-printed snakeskins enhance the friction anisotropy and thus the slithering locomotion of a snake robot. Experiments have been conducted with a soft pneumatic snake robot in various indoor and outdoor settings.
[ Paper ]
For bipedal humanoid robots to successfully operate in the real world, they must be competent at simultaneously executing multiple motion tasks while reacting to unforeseen external disturbances in real time. We propose Kinodynamic Fabrics as an approach for the specification, solution, and simultaneous execution of multiple motion tasks in real time while being reactive to dynamism in the environment.
The RPD 35 from Built Robotics is the world’s first autonomous piling system. It combines four steps—layout, pile distribution, pile driving, and as-builts—into one package. With the RPD 35, a two-person crew can install piles more productively than traditional methods allow.
[ Built Robotics ]
This work contributes a novel and modularized learning-based method for aerial robots navigating cluttered environments containing thin, hard-to-perceive obstacles without assuming access to a map or the full pose estimation of the robot.
[ ARL ]
The video shows a use case developed by the FZI with the assistance of the KIT: the multirobot retrieval of hazardous materials using two FZI robots as well as a KIT virtual-reality environment.
[ FZI ]
[ Soft Robotics ]
A year has passed since the launch of the ESA’s Rosalind Franklin rover mission was put on hold, but the work has not stopped for the ExoMars teams in Europe. In this program, the ESA Web TV crew travel back to Turin, Italy, to talk to the teams and watch as new tests are being conducted with the rover’s Earth twin, Amalia, while the real rover remains carefully stored in an ultraclean room.
[ ESA ]
Camilo Buscaron, chief technologist at AWS Robotics, sits down with Ramon Roche in this Behind the Tech episode to share his storied career in the robotics industry. Camilo explains how AWS provides a host of services for robotics developers from simulation and streaming to basic real-time cloud storage.
[ Behind the Tech ]
Every year, online job search firms collect data about the salaries, skills, and overall job market for tech professionals, generally focusing on software engineers.
The numbers from job search firms Dice and Hired have been released. These 2022 numbers have been eagerly anticipated, given the turmoil generated by a spate of tech layoffs in the latter part of the year, which Dice estimates at more than 140,000. The data they collect doesn’t allow for apples-to-apples comparisons, but I’ve read through both reports, pulled out data from past years to give the numbers some perspective when possible, and summarized it all in eight charts. Dice’s numbers come from a survey administered to its registered job seekers and site visitors between 16 August 2022 and 17 October 2022, for a total of 7,098 completed surveys. Hired’s analysis included data from 68,500 job candidates and 494,000 interview requests collected from the site between January 2021 and December 2022, supplemented by a survey of 1,300 software engineers.
According to Dice’s numbers, tech salaries grew 2.3 percent in 2022 compared with 2021, reflecting a steady upward trend since 2017 (with 2020 omitted due to the pandemic disruption). However, it’s clear that the 2022 news isn’t so good when considering inflation. These numbers have been adjusted from those previously reported by IEEE Spectrum; Dice recently tightened its survey to focus on tech professionals in more tech-specific job functions.
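To put a number on the inflation effect, convert nominal growth to real growth by dividing out the inflation rate. A quick sketch, using an assumed inflation rate of 6.5 percent; the exact figure depends on which period you measure:

```python
def real_change(nominal_growth_pct: float, inflation_pct: float) -> float:
    """Convert nominal salary growth to real (inflation-adjusted) growth,
    both expressed in percent."""
    return ((1 + nominal_growth_pct / 100) / (1 + inflation_pct / 100) - 1) * 100

# Dice's 2.3 percent nominal growth against an assumed 6.5 percent inflation:
r = real_change(2.3, 6.5)   # a decline of roughly 3.9 percent in real terms
```

In other words, a nominal raise can still be a real-terms pay cut when inflation outpaces it.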
If you want the highest pay, it’s a no-brainer: Get yourself into the C-suite. That is not, of course, a particularly useful takeaway from Dice’s data. Perhaps of more interest is that scrum masters are commanding higher pay than data scientists, and that cloud and cybersecurity engineers continue to hold solid spots in the top ranks.
Specific skills often command a big pay boost, but exactly what skills are in demand is a moving target. Because the data from Dice and Hired—and the way they crunch it—varies widely, we present two charts. Dice, looking at average salaries, puts MapReduce at the top of its charts; Hired, looking at interview requests, puts Ruby on Rails and Ruby on top.
If you’re a software engineer and you don’t know Python, you’d better start studying. That’s the opinion of the 1,300 software engineers surveyed by Hired. If you don’t know C, however, don’t worry too much about that.
You probably don’t want to leave Silicon Valley if you’re looking for the highest pay. The San Francisco Bay Area hasn’t lost its dominant position at the top of the tech salary charts, though those rankings don’t account for local cost-of-living differences. But some tech hubs are closing the gap, with Tampa, Fla., salaries up 19 percent and Charlotte, N.C., salaries up 11 percent. Charlotte, in fact, edged out Austin for number nine in Dice’s rankings, and every cost-of-living calculator I checked considers Charlotte a significantly cheaper place to live. Hired, which considered a shorter list, puts Austin at number five.
That artificial intelligence tops the list of booming businesses this year is no surprise, given the attention brought to AI by the public release of DALL-E 2 and ChatGPT last year, along with GPT-4 this month. Hired asked more than 1,300 software engineers their opinions on the hottest industries to watch in 2023, and AI and machine learning came out on top. Not so hot? E-commerce, media, and transportation.
Tesla’s investor day on 1 March began with a rambling, detailed discourse on energy and the environment before transitioning into a series of mostly predictable announcements and boasts. And then, out of nowhere, came an absolute bombshell: “We have designed our next drive unit, which uses a permanent-magnet motor, to not use any rare-earth elements at all,” declared Colin Campbell, Tesla’s director of power-train engineering.
It was a stunning disclosure that left most experts in permanent magnetism wary and perplexed. Alexander Gabay, a researcher at the University of Delaware, states flatly: “I am skeptical that any non-rare-earth permanent magnet could be used in a synchronous traction motor in the near future.” And at Uppsala University, in Sweden, Alena Vishina, a physicist, elaborates, “I’m not sure it’s possible to use only rare-earth-free materials to make a powerful and efficient motor.”
And at a recent magnetics conference Ping Liu, a professor at the University of Texas, in Arlington, asked other researchers what they thought of Tesla’s announcement. “No one fully understands this,” he reports. (Tesla did not respond to an e-mail asking for elaboration of Campbell’s comment.)
Tesla’s technical prowess should never be underestimated. But on the other hand, the company—and in particular, its CEO—has a history of making sporadic sensational claims that don’t pan out (we’re still waiting for that US $35,000 Model 3, for example).
The problem here is physics, which not even Tesla can alter. Permanent magnetism occurs in certain crystalline materials when the spins of electrons of some of the atoms in the crystal are forced to point in the same direction. The more of these aligned spins, the stronger the magnetism. For this, the ideal atoms are ones that have unpaired electrons swarming around the nucleus in what are known as 3d orbitals. Tops are iron, with four unpaired 3d electrons, and cobalt, with three.
But 3d electrons alone are not enough to make superstrong magnets. As researchers discovered decades ago, magnetic strength can be greatly improved by adding to the crystalline lattice atoms with unpaired electrons in the 4f orbital—notably the rare-earth elements neodymium, samarium, and dysprosium. These 4f electrons enhance a characteristic of the crystalline lattice called magnetic anisotropy—in effect, they promote adherence of the magnetic moments of the atoms to the specific directions in the crystal lattice. That, in turn, can be exploited to achieve high coercivity, the essential property that lets a permanent magnet stay magnetized. Also, through several complex physical mechanisms, the unpaired 4f electrons can amplify the magnetism of the crystal by coordinating and stabilizing the spin alignment of the 3d electrons in the lattice.
Since the 1980s, a permanent magnet based on a compound of neodymium, iron, and boron (NdFeB), has dominated high-performance applications, including motors, smartphones, loudspeakers, and wind-turbine generators. A 2019 study by Roskill Information Services, in London, found that more than 90 percent of the permanent magnets used in automotive traction motors were NdFeB.
So if not rare-earth permanent magnets for Tesla’s next motor, then what kind? Among experts willing to speculate, the choice was unanimous: ferrite magnets. Among the non-rare-earth permanent magnets invented so far, only two are in large-scale production: ferrites and another type called Alnico (aluminum nickel cobalt). Tesla isn’t going to use Alnico, a half-dozen experts contacted by IEEE Spectrum insisted. These magnets are weak and, more important, the world supply of cobalt is so fraught that they make up less than 2 percent of the permanent-magnet market.
Ferrite magnets, based on a form of iron oxide, are cheap and account for nearly 30 percent of the permanent-magnet market by sales. But they, too, are weak (one major use is holding refrigerator doors shut). A key performance indicator of a permanent magnet is its maximum energy product, measured in megagauss-oersteds (MGOe). It reflects both the strength of a magnet as well as its coercivity. For the type of NdFeB commonly used in automotive traction motors, this value is generally around 35 MGOe. For the best ferrite magnets, it is around 4.
“Even if you get the best-performance ferrite magnet, you will have performance about five to 10 times below neodymium-iron-boron,” says Daniel Salazar Jaramillo, a magnetics researcher at the Basque Center for Materials, Applications, and Nanostructures, in Spain. So compared to a synchronous motor built with NdFeB magnets, one based on ferrite magnets will be much larger and heavier, much weaker, or some combination of the two.
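To put those numbers in rough perspective, here is a minimal, illustrative Python sketch. It assumes, as a first-order simplification, that the magnet volume needed to store a given magnetic energy scales inversely with the maximum energy product; real motor design depends on many more variables (geometry, coercivity, temperature), so treat the ratio as a ballpark figure, not a specification.

```python
# Rough, illustrative comparison of the magnet volume needed for the same
# stored magnetic energy, assuming volume scales inversely with the maximum
# energy product (BH)max. A first-order sketch, not a motor design.

def magnet_volume_ratio(bhmax_a_mgoe, bhmax_b_mgoe):
    """Volume of a magnet of type B relative to type A for equal stored energy."""
    return bhmax_a_mgoe / bhmax_b_mgoe

ndfeb = 35.0   # typical automotive-grade NdFeB, in MGOe (from the article)
ferrite = 4.0  # the best ferrite magnets, in MGOe (from the article)

ratio = magnet_volume_ratio(ndfeb, ferrite)
print(f"A ferrite magnet needs roughly {ratio:.1f}x the volume of NdFeB")
```

The nearly ninefold volume penalty is why a ferrite-based synchronous motor of comparable output ends up so much larger and heavier.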
To be sure, there are more than a score of other permanent magnets that use no rare-earth elements or don’t use much of them. But none of these have made an impact outside the laboratory. The list of attributes needed for a commercially successful permanent magnet includes high field strength, high coercivity, tolerance of high temperatures, good mechanical strength, ease of manufacturing, and lack of reliance on elements that are scarce, toxic, or problematic for some other reason. All of the candidates today fail to tick one or more of these boxes.
Iron-nitride magnets, such as this one from startup Niron Magnetics, are among the most promising of an emerging crop of permanent magnets that do not use rare-earth elements. Photo: Niron Magnetics
But give it a few more years, say some researchers, and one or two of these could very well break through. Among the most promising: iron nitride, Fe16N2. A Minneapolis startup, Niron Magnetics, is now commercializing technology that was pioneered with funding from ARPA-E by Jian Ping Wang at the University of Minnesota in the early 2000s, after earlier work at Hitachi. Niron’s executive vice president, Andy Blackburn, told Spectrum that the company intends to introduce its first product late in 2024. Blackburn says it will be a permanent magnet with an energy product above 10 MGOe, for which he anticipates applications in loudspeakers and sensors, among others. If it succeeds, it will be the first new commercial permanent magnet since NdFeB, 40 years ago, and the first commercial non-rare-earth permanent magnet since strontium ferrite, the best ferrite type, 60 years ago.
Niron’s first offering will be followed in 2025 by a magnet with an energy product above 30 MGOe, according to Blackburn. For this he makes a rather bold prediction: “It’ll have as good or better flux than neodymium. It’ll have the coercivity of a ferrite, and it’ll have the temperature coefficients of samarium cobalt”—better than NdFeB. If the magnet really manages to combine all those attributes (a big if), it would be very well suited for use in the traction motors of electric vehicles.
There will be more to come, Blackburn declares. “All these new nanoscale-engineering capabilities have allowed us to create materials that would have been impossible to make 20 years ago,” he says.
As technology continues to evolve, STEM education is needed more than ever. With the vast technical expertise of its 400,000-plus members and volunteers, IEEE is a leader in engineering and technology education. Its technical societies and its councils, sections, and regional groups offer educational events and resources at every level to support technical professions and prepare the workforce of tomorrow.
IEEE offers many ways to support the educational needs of learners. For preuniversity students, the organization offers summer camps and other opportunities to explore science, technology, engineering, and mathematics careers. IEEE’s continuing education courses allow professionals to stay up to date on technology, keep their skills sharp, and learn new things.
From 2 to 8 April, IEEE is highlighting resources available to students, educators, and technical professionals with IEEE Education Week. The annual celebration highlights educational opportunities provided by the world’s largest technical professional association and its many organizational units, societies, and councils.
Here are some of the events and resources available during this year’s Education Week.
Climate Change: IEEE’s Role in Bringing Technology Solutions to Meet the Challenge
3 April, noon to 1 p.m. EDT
IEEE President and CEO Saifur Rahman kicks off Education Week with a session on how the organization can serve as a vital connection between policymakers and the engineering and technology communities in bringing technological solutions to meet the universal challenge of climate change. Rahman plans to share how IEEE is committed to helping mitigate the effects of climate change through pragmatic and accessible technical solutions, as well as by providing engineers and technologists with a neutral space for discussion and action. The webinar also addresses the importance of educating the energy workforce.
3 April, 9 to 10 a.m. EDT
IEEE REACH (Raising Engineering Awareness through the Conduit of History) provides teachers with resources to help them explain the history of technology and the roles played by engineers. During this webinar, participants can learn how REACH can enhance the classroom experience.
Do This, Not That! Applying Multimedia Learning Principles in Your Online Module/Presentation to Enhance Comprehension
5 April, 11 to 11:45 a.m. EDT
Many people are sharing their expertise on TikTok, YouTube, and other online platforms. When sharing knowledge in a multimedia-rich environment, there are research-proven principles that can be applied to enhance the presentation—which in turn promotes knowledge transfer. This webinar is designed to show participants how to apply those principles to their own presentations.
Here are some additional offerings and resources available during IEEE Education Week.
For a list of webinars and events and more resources, visit the IEEE Education Week website.
IEEE-affiliated groups can participate in IEEE Education Week by offering events, resources, and special offers such as discounted courses. Additionally, a tool kit is available to help groups promote IEEE Education Week and their event through newsletters, social media, and more.
The Education Week website provides special offers and discounts as well. You also can support education programs by donating to the IEEE Foundation.
Check out the IEEE Education Week video to learn more.
You do not need to be a member to participate in IEEE Education Week; however, members receive discounted or free access to many of the events and resources.
If you’re not an IEEE member, now would be a great time to join.
My favorite approach to human-robot interaction is minimalism. I’ve met a lot of robots, and some of the ones that have most effectively captured my heart are those that express themselves through their fundamental simplicity and purity of purpose. What’s great about simple, purpose-driven robots is that they encourage humans to project needs and wants and personality onto them, letting us do a lot of the human-robot-interaction (HRI) heavy lifting.
In terms of simple, purpose-driven robots, you can’t do much better than a robotic trash barrel (or bin or can or what have you). And in a paper presented at HRI 2023 this week, researchers from Cornell explored what happened when random strangers interacted with a pair of autonomous trash barrels in NYC, with intermittently delightful results.
What’s especially cool about this is how much HRI takes place around these robots that have essentially no explicit HRI features, since they’re literally just trash barrels on wheels. They don’t even have googly eyes! However, as the video notes, they’re controlled remotely by humans, so a lot of the movement-based expression they demonstrate likely comes from a human source—whether or not that’s intentional. These remote-controlled robots move much differently than an autonomous robot would. Folks who know how autonomous mobile robots work expect such machines to perform slow, deliberate motions along smooth trajectories. But as an earlier paper on trash barrel robots describes, most people expect the opposite:
One peculiarity we discovered is that individuals appear to have a low confidence in autonomy, associating poor navigation and social mistakes with autonomy. In other words, people were more likely to think that the robot was computer controlled if they observed it getting stuck, bumping into obstacles, or ignoring people’s attempts to draw its attention.
We initially stumbled upon this perception when a less experienced robot driver was experimenting with the controls, actively moving the robot in strange patterns. An observer nearby asserted that the robot “has to be autonomous. It’s too erratic to be controlled by a person!”
A lot of inferred personality can come from robots that make mistakes or need help; in many contexts this is a bug, but for simple social robots where their purpose can easily be understood, it can turn into an endearing feature:
Due to the non-uniform pavement surface, the robots occasionally got stuck. People were keen to help the robots when they were in trouble. Some observers would proactively move chairs and obstacles to clear a path for the robots. Furthermore, people interpreted the back-and-forth wobbling motion as if the robots were nodding and agreeing with them, even when such motion was caused merely by uneven surfaces.
Another interesting thing going on here is how people expect that the robots want to be “fed” trash and recycling:
Occasionally, people thought the robots expected trash from them and felt obligated to give the robots something. As the robot passed and stopped by the same person for the second time, she said: “I guess it knows I’ve been sitting here long enough, I should give it something.” Some people would even find an excuse to generate trash to “satisfy” and dismiss the trash barrel by searching through a bag or picking rubbish up off the floor.
The earlier paper goes into a bit more detail on what this leads to:
It appears that people naturally attribute intrinsic motivation (or desire to fulfill some need) to the robot’s behavior and that mental model encourages them to interact with the robot in a social way by “feeding” the robot or expecting a social reciprocation of a thank you. Interestingly, the role casted upon the robot by the bystanders is reminiscent of a beggar where it prompts for collections and is expected to be thankful for donations. This contrasts sharply with human analogs such as waitstaff or cleanup janitors where they offer assistance and the receiving bystander is expected to express gratitude.
I wonder how much of this social interaction is dependent on the novelty of meeting the trash barrel robots for the first time, and whether (if these robots were to become full-time staff) humans would start treating them more like janitors. I’m also not sure how well these robots would do if they were autonomous. If part of the magic comes from having a human in the loop to manage what seems like (but probably aren’t) relatively simple human-robot interactions, turning that into effective autonomy could be a real challenge.
Trash Barrel Robots in the City, by Fanjun Bu, Ilan Mandel, Wen-Ying Lee, and Wendy Ju, is presented this week at HRI 2023 in Stockholm, Sweden.
New nuclear looks different, and that requires new types of financing. New investments and partnerships are announced seemingly every day across the industry, including SK Group’s $250 million investment in TerraPower and X-energy’s partnership with Dow Chemical.
What can be done to encourage financial investment and improve the economic viability and the ROI of SMRs? How does new nuclear differ, and how do we finance that?
Reuters Events’ latest report – Capital Funding, Financing & Economic Viability of SMRs – dives into the vehicles that can help advance financing to support the deployment and commercialization of SMRs and advanced reactors. What to expect from the report:
This morning, drone-delivery company Zipline announced a new drone-delivery system offering nearly silent, precise delivery that’s intended to expand the company’s capabilities into home delivery. This requires a much different approach from what Zipline has been doing for the past eight years. In order to make home deliveries that are quiet and precise, Zipline has developed a creative new combination of hybrid drones, “droids,” and all the supporting hardware necessary to make deliveries directly to your front porch.
We visited one of Zipline’s distribution centers in Rwanda a few years ago to see how effective their system was at delivering blood across the country’s rugged terrain. To watch a delivery take place, we drove an hour over winding dirt roads to a rural hospital. Shortly after we arrived, a drone made the trip and delivered a package of blood in about 14 minutes. It was a compelling example of the value of drone delivery in situations where you have critical and time-sensitive goods in areas of low infrastructure, but the challenges of urban home delivery are something else entirely.
Zipline’s current generation of fixed-wing delivery drones works by dropping boxes tethered to small parachutes while flying several tens of meters above an open delivery area. You need some obstacle-free space for this to work reliably (say, a handful of empty parking spaces or the equivalent), and it’s not a particularly gentle process, meaning that there are some constraints on what you can deliver and how it’s packaged. For hospitals and health centers, this is usually no problem. For your home, it may well not be an option at all.
Zipline’s new drones are much different. In a heavily produced online event featuring the Zipline team alongside Rwandan president Paul Kagame and company board member Bono, Zipline introduced P2, a new delivery system that combines a hybrid fixed-wing drone with a small tethered droid that can drop out of the belly of the drone to make precision deliveries.
Housed within the P2 Zip, the droid and whatever it’s carrying can travel at 112 kilometers per hour through all kinds of weather out to a service radius of about 16 km with an impressive 2.5- to 3.5-kilogram payload. Once the P2 reaches its delivery destination, the Zip hovers at a few hundred feet while an integrated winch lowers the droid and the package down to the ground. The Zip remains at a height that is both safe and quiet while the droid uses integrated thrusters to accurately position itself over the delivery zone, which at just half a meter across could easily be the top of a picnic table. Visual sensors on the droid make sure that the delivery zone is clear. As soon as it touches down, the droid drops its cargo out of its belly. Then it gets winched back up into the Zip, and the team heads back home.
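A quick back-of-the-envelope check on those figures, sketched in Python below. The cruise speed and service radius come from the announcement; climb, hover, winch time, and wind are ignored, so this is only a rough lower bound on delivery time.

```python
# Back-of-the-envelope flight-time estimate from the announced P2 figures.
# Ignores climb, hover, winch time, and wind: a rough lower bound only.
speed_kmh = 112.0  # announced cruise speed
radius_km = 16.0   # announced service radius

one_way_min = radius_km / speed_kmh * 60
round_trip_min = 2 * one_way_min
print(f"one way: {one_way_min:.1f} min, round trip: {round_trip_min:.1f} min")
```

Even at the edge of the service radius, the cruise legs alone take well under 20 minutes round trip, which is what makes on-demand food and retail delivery plausible.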
On the other end of things, there’s an integrated loading system where the P2 Zips can be charging outdoors (using an interesting overhead charging system) while the droids drop down a chute to be loaded indoors one by one.
While the event didn’t show a complete delivery cycle, we’re told that all of the hardware is operational and very close to a production design, and that all of the delivery steps have been successfully completed with real aircraft. There’s still a lot of testing to be done, of course, and Zipline expects to have 10,000 flights completed over the summer, with its first deployment to follow. Initial customers include a couple of regional health systems in the United States, Sweetgreen restaurants, and the government of Rwanda, with President Kagame himself as the very first customer. And to be clear, the P2 is not replacing Zipline’s original drone-delivery infrastructure—with their 100-km range, the original Zips (now called the P1) are still keeping quite busy delivering critical goods in Rwanda and elsewhere around the world.
For Synopsys Chief Executive Aart de Geus, running the electronic design automation behemoth is similar to being a bandleader. He brings together the right people, organizes them into a cohesive ensemble, and then leads them in performing their best.
De Geus, who helped found the company in 1986, has some experience with bands. The IEEE Fellow has been playing guitar in blues and jazz bands since he was an engineering student in the late 1970s.
Much like jazz musicians improvising, engineers go with the flow at team meetings, he says: One person comes up with an idea, and another suggests ways to improve it.
“There are actually a lot of commonalities between my music hobby and my other big hobby, Synopsys,” de Geus says.
Synopsys is now the largest supplier of software that engineers use to design chips, employing about 20,000 people. The company reported US $1.36 billion in revenue in the first quarter of this year.
De Geus is considered a founding father of electronic design automation (EDA), which automates chip design using synthesis and other tools he and his team pioneered in the 1980s. Synthesis revolutionized digital design by taking the high-level functional description of a circuit and automatically selecting the logic components (gates) and constructing the connections (netlist) to build it. Virtually all large digital chips manufactured today are largely synthesized, using software that de Geus and his team developed.
“Synthesis changed the very nature of how digital chips are designed, moving us from the age of computer-aided design (CAD) to electronic design automation (EDA),” he says.
During the past three and a half decades, logic synthesis has enabled about a 10 millionfold increase in chip complexity, he says. For that reason, Electrical Business magazine named him one of the 10 most influential executives in 2002, as well as its 2004 CEO of the Year.
Born in Vlaardingen, Netherlands, de Geus grew up mostly in Basel, Switzerland. He earned a master’s degree in electrical engineering in 1978 from the École Polytechnique Fédérale de Lausanne, known as EPFL, in Lausanne.
In the early 1980s, while pursuing a Ph.D. in electrical engineering from Southern Methodist University, in Dallas, de Geus joined General Electric in Research Triangle Park, N.C. There he developed tools to design logic with multiplexers, according to a 2009 oral history conducted by the Computer History Museum. He and a designer friend created gate arrays with a mix of logic gates and multiplexers.
That led to writing the first program for synthesizing circuits optimized for both speed and area, known as SOCRATES. It automatically created blocks of logic from functional descriptions, according to the oral history.
“The problem was [that] all designers coming out of school used Karnaugh maps, [and] knew NAND gates, NOR gates, and inverters,” de Geus explained in the oral history. “They didn’t know multiplexers. So designing with these things was actually difficult.” Karnaugh maps are a method of simplifying Boolean algebra expressions. With NAND and NOR universal logic gates, any Boolean expression can be implemented without using any other gate.
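The universality of NAND that the oral history alludes to is easy to demonstrate. The following Python sketch (an illustration of the principle, not anything from SOCRATES itself) builds NOT, AND, OR, and XOR out of nothing but NAND and checks them exhaustively:

```python
# Illustrative sketch: building the standard gates from NAND alone, the
# point behind calling NAND a "universal" gate. Inputs and outputs are 0/1.

def nand(a, b):
    return 0 if (a and b) else 1

def not_(a):          # NOT(a) = NAND(a, a)
    return nand(a, a)

def and_(a, b):       # AND = NOT(NAND)
    return not_(nand(a, b))

def or_(a, b):        # OR(a, b) = NAND(NOT a, NOT b), by De Morgan's law
    return nand(not_(a), not_(b))

def xor(a, b):        # XOR built from four NANDs
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

# Exhaustive check against Python's own Boolean operators
for a in (0, 1):
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
        assert xor(a, b) == (a ^ b)
print("all NAND-only gates match")
```

The same construction works with NOR, which is why designers trained on NAND and NOR gates could in principle express anything a multiplexer-based library could, just not comfortably.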
With SOCRATES, a designer could write a function, and 20 minutes later the program would generate a netlist naming the electronic components in the circuit and the nodes they connected to. By automating the function, de Geus says, “the synthesizer typically created faster circuits that also used fewer gates. That’s a big benefit because fewer is better. Fewer ultimately end up in [a] smaller area on a chip.”
With that technology, circuit designers shifted their focus from gate-level design to designs based on hardware description languages.
Eventually de Geus was promoted to manager of GE’s Advanced Computer-Aided Engineering Group. Then, in 1986, the company decided to leave the semiconductor business. Facing the loss of his job, he decided to launch his own company to continue to enhance synthesis tools.
He and two members of his GE team, David Gregory and Bill Krieger, founded Optimal Solutions in Research Triangle Park. In 1987 the company was renamed Synopsys and moved to Mountain View, Calif.
De Geus says he picked up his management skills and entrepreneurial spirit as a youngster. During summer vacations, he would team up with friends to build forts, soapbox cars, and other projects. He usually was the team leader, he says, the one with plenty of imagination.
“An entrepreneur creates a vision of some crazy but, hopefully, brilliant idea,” he says, laughing. The vision sets the direction for the project, he says, while the entrepreneur’s business side tries to convince others that the idea is realistic enough.
“The notion of why it could be important was sort of there,” he says. “But it is the passion that catalyzes something in people.”
That was true during his fort-building days, he says, and it’s still true today.
“Synthesis changed the very nature of how digital designs are being constructed.”
“If you have a good team, everybody chips in something,” he says. “Before you know it, someone on the team has an even better idea of what we could do or how to do it. Entrepreneurs who start a company often go through thousands of ideas to arrive at a common mission. I’ve had the good fortune to be on a 37-year mission with Synopsys.”
At the company, de Geus sees himself as “the person who makes the team cook. It’s being an orchestrator, a bandleader, or maybe someone who brings out the passion in people who are better in both technology and business. As a team, we can do things that are impossible to do alone and that are patently proven to be impossible in the first place.”
He says a few years ago the company came up with the mantra “Yes, if …” to combat a slowly growing “No, because …” mindset.
“‘Yes, if …’ opens doors, whereas the ‘No, because …’ says, ‘Let me prove that it’s not possible,’” he says. “‘Yes, if … ’ leads us outside the box into ‘It’s got to be possible. There’s got to be a way.’”
De Geus says his industry is going through “extremely challenging times—technically, globally, and business-wise—and the ‘If … ’ part is an acknowledgment of that. I found it remarkable that once a group of people acknowledge [something] is difficult, they become very creative. We’ve managed to get the whole company to embrace ‘Yes, if …’
“It is now in the company’s cultural DNA.”
One of the issues Synopsys is confronted with is the end of Moore’s Law, de Geus says. “But no worries,” he says. “We are facing an unbelievable new era of opportunity, as we have moved from ‘Classic Moore’ scale complexity to ‘SysMoore,’ which unleashes systemic complexity with the same Moore’s Law exponential ambition!”
He says the industry is moving its focus from single chips to multichip modules, with chips closely placed together on top of a larger, “silicon interposer” chip. In some cases, such as for memory, chips are stacked on top of each other.
“How do you make the connectivity between those chips as fast as possible? How can you technically make these pieces work? And then how can you make it economically viable so it is producible, reliable, testable, and verifiable? Challenging, but so powerful,” he says. “Our big challenge is to make it all work together.”
Pursuing engineering was a calling for de Geus. Engineering was the intersection of two things he loved: carrying out a vision and building things. Notwithstanding the recent wave of tech-industry layoffs, he says he believes engineering is a great career.
“Just because a few companies have overhired or are redirecting themselves doesn’t mean that the engineering field is in a downward trend,” he says. “I would argue the opposite, for sure in the electronics and software space, because the vision of ‘smart everything’ requires some very sophisticated capabilities, and it is changing the world!”
During the Moore’s Law era, one’s technical knowledge has had to be deep, de Geus says.
“You became really specialized in simulation or in designing a certain type of process,” he says. “In our field, we need people who are best in class. I like to call them six-Ph.D.-deep engineers. It’s not just schooling deep; it’s schooling and experientially deep. Now, with systemic complexity, we need to bring all these disciplines together; in other words we now need six-Ph.D.-wide engineers too.”
To obtain that type of experience, he recommends university students should get a sense of multiple subdisciplines and then “choose the one that appeals to you.”
“For those who have a clear sense of their own mission, it’s falling in love and finding your passion,” he says. But those who don’t know which field of engineering to pursue should “engage with people you think are fantastic, because they will teach you things such as perseverance, enthusiasm, passion, what excellence is, and make you feel the wonder of collaboration.” Such people, he says, can teach you to “enjoy work instead of just having a job. If work is also your greatest hobby, you’re a very different person.”
De Geus says engineers must take responsibility for more than the technology they create.
“I always liked to say that ‘he or she who has the brains to understand should have the heart to help.’ With the growing challenges the world faces, I now add that they should also have the courage to act,” he says. “What I mean is that we need to look and reach beyond our field, because the complexity of the world needs courageous management to not become the reason for its own destruction.”
He notes that many of today’s complexities are the result of fabulous engineering, but the “side effects—and I am talking about CO2, for example—have not been accounted for yet, and the engineering debt is now due.”
De Geus points to the climate crisis: “It is the single biggest challenge there is. It’s both an engineering and a social challenge. We need to figure out a way to not have to pay the whole debt. Therefore, we need to engineer rapid technical transitions while mitigating the negatives of the equation. Great engineering will be decisive in getting there.”
Not all technological innovation deserves to be called progress. That’s because some advances, despite their conveniences, may not do as much societal advancing, on balance, as advertised. One researcher who stands opposite technology’s cheerleaders is MIT economist Daron Acemoglu. (The “c” in his surname is pronounced like a soft “g.”) IEEE Spectrum spoke with Acemoglu—whose fields of research include labor economics, political economy, and development economics—about his recent work and his take on whether technologies such as artificial intelligence will have a positive or negative net effect on human society.
IEEE Spectrum: In your November 2022 working paper “Automation and the Workforce,” you and your coauthors say that the record is, at best, mixed when AI encounters the job force. What explains the discrepancy between the greater demand for skilled labor and their staffing levels?
Acemoglu: Firms often lay off less-skilled workers and try to increase the employment of skilled workers.
“Generative AI could be used, not for replacing humans, but to be helpful for humans. ... But that’s not the trajectory it’s going in right now.”
—Daron Acemoglu, MIT
In theory, high demand and tight supply are supposed to result in higher prices—in this case, higher salary offers. It stands to reason that, based on this long-accepted principle, firms would think ‘More money, less problems.’
Acemoglu: You may be right to an extent, but... when firms are complaining about skill shortages, a part of it is I think they’re complaining about the general lack of skills among the applicants that they see.
In your 2021 paper “Harms of AI,” you argue that if AI remains unregulated, it’s going to cause substantial harm. Could you provide some examples?
Acemoglu: Well, let me give you two examples from ChatGPT, which is all the rage nowadays. ChatGPT could be used for many different things. But the current trajectory of large language models, epitomized by ChatGPT, is very much focused on the broad automation agenda. ChatGPT tries to impress the users. What it’s trying to do is to be as good as humans in a variety of tasks: answering questions, being conversational, writing sonnets, and writing essays. In fact, in a few things, it can be better than humans because writing coherent text is a challenging task, and predictive tools of what word should come next, on the basis of a large corpus of data from the Internet, do that fairly well.
The path that GPT-3 [the large language model that spawned ChatGPT] is going down is emphasizing automation. And there are already other areas where automation has had a deleterious effect—job losses, inequality, and so forth. If you think about it, you will see—or you could argue anyway—that the same architecture could have been used for very different things. Generative AI could be used, not for replacing humans, but to be helpful for humans. If you want to write an article for IEEE Spectrum, you could either go and have ChatGPT write that article for you, or you could use it to curate a reading list for you that might capture things you didn’t know yourself that are relevant to the topic. The question would then be how reliable the different articles on that reading list are. Still, in that capacity, generative AI would be a human-complementary tool rather than a human-replacement tool. But that’s not the trajectory it’s going in right now.
“OpenAI, taking a page from Facebook’s ‘move fast and break things’ code book, just dumped it all out. Is that a good thing?”
—Daron Acemoglu, MIT
Let me give you another example more relevant to the political discourse. Because, again, the ChatGPT architecture is based on just taking information from the Internet that it can get for free. And then, having a centralized structure operated by OpenAI, it has a conundrum: If you just take the Internet and use your generative AI tools to form sentences, you could very likely end up with hate speech including racial epithets and misogyny, because the Internet is filled with that. So, how does ChatGPT deal with that? Well, a bunch of engineers sat down and they developed another set of tools, mostly based on reinforcement learning, that allow them to say, “These words are not going to be spoken.” That’s the conundrum of the centralized model. Either it’s going to spew hateful stuff or somebody has to decide what’s sufficiently hateful. But that is not going to be conducive for any type of trust in political discourse, because it could turn out that three or four engineers—essentially a group of white coats—get to decide what people can hear on social and political issues. I believe those tools could be used in a more decentralized way, rather than within the auspices of centralized big companies such as Microsoft, Google, Amazon, and Facebook.
Instead of continuing to move fast and break things, innovators should take a more deliberate stance, you say. Are there some definite no-nos that should guide the next steps toward intelligent machines?
Acemoglu: Yes. And again, let me give you an illustration using ChatGPT. They wanted to beat Google [to market, understanding that] some of the technologies were originally developed by Google. And so, they went ahead and released it. It’s now being used by tens of millions of people, but we have no idea what the broader implications of large language models will be if they are used this way, or how they’ll impact journalism, middle school English classes, or what political implications they will have. Google is not my favorite company, but in this instance, I think Google would be much more cautious. They were actually holding back their large language model. But OpenAI, taking a page from Facebook’s ‘move fast and break things’ code book, just dumped it all out. Is that a good thing? I don’t know. OpenAI has become a multi-billion-dollar company as a result. It was always a part of Microsoft in reality, but now it’s been integrated into Microsoft Bing, while Google lost something like 100 billion dollars in value. So, you see the high-stakes, cutthroat environment we are in and the incentives that that creates. I don’t think we can trust companies to act responsibly here without regulation.
Tech companies have asserted that automation will put humans in a supervisory role instead of just killing all jobs. The robots are on the floor, and the humans are in a back room overseeing the machines’ activities. But who’s to say the back room is not across an ocean instead of on the other side of a wall—a separation that would further enable employers to slash labor costs by offshoring jobs?
Acemoglu: That’s right. I agree with all those statements. I would say, in fact, that’s the usual excuse of some companies engaged in rapid algorithmic automation. It’s a common refrain. But you’re not going to create 100 million jobs of people supervising, providing data, and training to algorithms. The point of providing data and training is that the algorithm can now do the tasks that humans used to do. That’s very different from what I’m calling human complementarity, where the algorithm becomes a tool for humans.
“[Imagine] using AI... for real-time scheduling which might take the form of zero-hour contracts. In other words, I employ you, but I do not commit to providing you any work.”
—Daron Acemoglu, MIT
According to “The Harms of AI,” executives trained to hack away at labor costs have used tech to help, for instance, skirt labor laws that benefit workers. Say, scheduling hourly workers’ shifts so that hardly any ever reach the weekly threshold of hours that would make them eligible for employer-sponsored health insurance coverage and/or overtime pay.
Acemoglu: Yes, I agree with that statement too. Even more important examples would be using AI for monitoring workers, and for real-time scheduling, which might take the form of zero-hour contracts. In other words, I employ you, but I do not commit to providing you any work. You’re my employee. I have the right to call you. And when I call you, you’re expected to show up. So, say I’m Starbucks. I’ll call and say ‘Willie, come in at 8 a.m.’ But I don’t have to call you, and if I don’t do it for a week, you don’t make any money that week.
Will the simultaneous spread of AI and the technologies that enable the surveillance state bring about a total absence of privacy and anonymity, as was depicted in the sci-fi film Minority Report?
Acemoglu: Well, I think it has already happened. In China, that’s exactly the situation urban dwellers find themselves in. And in the United States, it’s actually private companies. Google has much more information about you and can constantly monitor you unless you turn off various settings in your phone. It’s also constantly using the data you leave on the Internet, on other apps, or when you use Gmail. So, there is a complete loss of privacy and anonymity. Some people say ‘Oh, that’s not that bad. Those are companies. That’s not the same as the Chinese government.’ But I think their use of data for individualized, targeted ads raises a lot of issues. It’s also problematic that they’re selling your data to third parties.
In four years, when my children will be about to graduate from college, how will AI have changed their career options?
Acemoglu: That goes right back to the earlier discussion with ChatGPT. Programs like GPT-3 and GPT-4 may scuttle a lot of careers but without creating huge productivity improvements on their current path. On the other hand, as I mentioned, there are alternative paths that would actually be much better. AI advances are not preordained. It’s not like we know exactly what’s going to happen in the next four years, but it’s about trajectory. The current trajectory is one based on automation. And if that continues, lots of careers will be closed to your children. But if the trajectory goes in a different direction, and becomes human complementary, who knows? Perhaps they may have some very meaningful new occupations open to them.