We previously discussed why Amazon Echo succeeded. In this issue of “AI said seriously”, we talked with Chen Xiaoliang, the founder and CEO of Sonic Technology, about how to build China’s Echo.
Chen Xiaoliang received his Ph.D. from the Institute of Acoustics, Chinese Academy of Sciences, and was an associate researcher (Associate Professor) at the same institute before starting the business.
(Chen Xiaoliang, founder and CEO of Sonic Technologies)
Why is speech important?
Humans have interacted by voice for at least tens of millions of years. Speech, as the most natural means of human communication, has inherent advantages over text and images.
If the move from the mouse-driven desktop to the finger-driven smartphone, which is closer to human nature, was a big leap forward, the new generation of voice interaction services will be a bigger one.
Imagine accomplishing whatever you want simply by speaking. How well that suits the lazy side of human nature. Maybe you are not used to talking to a piece of hardware, but a new generation will adapt to this interaction as a matter of course.
Voice interaction is the only way to achieve AR in the future
If we believe that augmented reality equipment (AR) is a future technology trend, then voice is also a necessary interactive mode in the AR era.
Whether or not the augmented reality device is always on, voice wake-up and control will be a convenient way to interact, and combining voice recognition with other modes such as gesture recognition will make the device more flexible to use (much as the telephone was invented before the television).
In fact, Microsoft's HoloLens is already equipped with voice recognition. In the future, voice interaction will become standard on augmented reality products.
Voice products may upend business models
Speaking of voice products, Chen Xiaoliang believes current usage scenarios fall into two kinds. One is near-field voice (voice interaction on a phone, with the mouth close to the microphone), where the recognition rate is above 90%. The other is far-field voice (recognition at 3-5 meters, in in-vehicle and smart home environments), where the recognition rate is actually not high, in many cases only 50%, so the market experience of voice products is poor.
Amazon’s Echo is a successful example of far-field voice. More than 6.5 million Echo units were sold in 2016, compared with 1.7 million in 2015, and sales are expected to exceed 10 million units in 2017. Behind the surge is Echo's rapid move from the niche circle of early adopters into the mass market. High hopes are pinned on Echo, which is expected to become the hub of a generation of smart devices.
(Echo series products)
The company most alarmed by Echo's success is Google. If people grow increasingly accustomed to voice and voice assistants instead of traditional search services, search engines will be replaced.
The advertising market represented by search engines will no longer be the mainstream. What is more likely to follow is a transaction model based on consumption and services, in which consumers pay only when they need a particular service. This could resemble a truly personalized recommendation service, more accurate than a search engine and with a higher conversion rate.
From a certain point of view, the company that masters a voice assistant such as Alexa will be the Google of the new era, the entrance to services and traffic. Voice-enabled front-end hardware, such as the products Sonic Technologies makes, will also be a billion-dollar market, and new giants will emerge at the application layer of the new ecosystem.
Here history looks somewhat similar. It reminds the Inspector of how Google, whose main business model is advertising, launched the free Android mobile operating system and fought head-on in the mobile market against Microsoft, which sells software licenses. The core reason for the outcome is that Microsoft never found a business model that suited it on mobile. This was one business model relentlessly crushing another.
Today, Google is facing the same threat. Amazon, which mainly sells goods and transactions, launched the voice product Echo. The powerful Google followed closely with Google Home. However, Google has not necessarily worked out how Google Home will make money. This is an understandable problem.
The overlooked AirPods and a smart Apple
The king of near-field voice is Apple. Although Siri on the iPhone is of only marginal use, Apple has kept investing in near-field voice. Siri continues to iterate and has been integrated into the Mac desktop operating system. Apple will apparently keep improving Siri until it finds the right user scenario.
Apple's other important move in near-field voice is AirPods.
AirPods reminds me of the sci-fi movie "Her", in which the newly heartbroken protagonist Theodore falls in love with an artificial intelligence. He carries the artificial intelligence system in a small wireless earpiece and wears it every day to work, on the subway, out shopping, and to the beach, so that he can talk to her anytime and anywhere.
(Still from "Her": the protagonist wearing the earpiece)
The artificial intelligence system, called Samantha, is understanding, has a captivating voice, and helps the protagonist solve many problems in his life. AirPods and Siri can also help you solve some simple problems. If artificial intelligence makes a qualitative leap in the future, it cannot be ruled out that everyone will have their own "Samantha" and never again fear heartbreak.
And Apple has a clear business model of its own here: selling devices.
Market research firm Slice Intelligence recently released online sales data for the US wireless headphone market. In just one month, Apple's AirPods captured 26% of that market.
Do not build brand-new hardware directly
Before Amazon, Google and Microsoft had been researching artificial intelligence technology for a long time. Why were they overtaken in voice interaction by Amazon, a latecomer to the field?
Amazon started by researching software and hardware together. It spent five years developing Echo's core technology, a microphone array for far-field recognition, and grounded Echo in the smart speaker, a hardware product with real demand.
Echo did not create a brand-new piece of hardware. Instead, it added voice recognition to an existing hardware category and effectively solved the problem of voice recognition at a distance and in noisy environments. Google's earlier research stayed at the level of algorithms, deep learning, and software. Bringing voice interaction into real use means solving speech recognition in real scenes: the distance between the speaker and the machine must be taken into account, and the machine must recognize a voice command spoken from any position in the house. This involves handling noise, reverberation, echo, and other interfering sounds, which software alone cannot solve. "A bottleneck in far-field speech interaction technology is acoustics, which is also our core technological advantage," said Chen Xiaoliang.
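As a rough illustration of why a microphone array helps with far-field speech, the sketch below implements delay-and-sum beamforming, a textbook array-processing technique. It is not Echo's or Sonic Technologies' actual algorithm, which the article does not describe. The function names, the integer sample delays (assumed known in advance), and the toy signal values are all assumptions made for illustration.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   delays[i] is how many samples later the target sound reaches
              microphone i (assumed known here; real systems must estimate it).
    """
    length = len(channels[0]) - max(delays)
    output = []
    for n in range(length):
        # Read each channel at its own delayed position so the target lines up.
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


# Toy scene (all values hypothetical): a 200 Hz tone sampled at 8 kHz reaches
# four microphones with different delays, and each adds independent noise.
random.seed(0)
num_samples = 2000
delays = [0, 3, 6, 9]
total_len = num_samples + max(delays)
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(total_len)]

channels = []
for d in delays:
    noise = [random.gauss(0.0, 1.0) for _ in range(total_len)]
    # The microphone with delay d hears the tone d samples late.
    delayed = [0.0] * d + tone[: total_len - d]
    channels.append([s + v for s, v in zip(delayed, noise)])

enhanced = delay_and_sum(channels, delays)
clean = tone[: len(enhanced)]

# Residual noise = signal minus the clean tone, before and after beamforming.
noise_single = rms([channels[0][n] - clean[n] for n in range(len(enhanced))])
noise_array = rms([enhanced[n] - clean[n] for n in range(len(enhanced))])
print(f"noise RMS, one microphone: {noise_single:.3f}")
print(f"noise RMS, four-mic array: {noise_array:.3f}")
```

When the noise at each microphone is independent, averaging N aligned channels lowers the noise level by roughly a factor of the square root of N while leaving the target speech intact. This toy version ignores reverberation and echo, which the article also names as part of the acoustic problem.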
(AirPods is a big game)
With AirPods, Apple follows the same logic as Amazon. As a Bluetooth headset, AirPods is already a functional device in its own right, and voice assistants such as Siri will make it more powerful in the future.
Of course, it also matters that Apple and Amazon are both strong at selling, which is where Google is at a disadvantage.
Respect the hardware cycle
Amazon Echo was developed by Amazon's Lab126, which previously launched products such as the Kindle and the Fire Phone. The Echo project started at the end of 2010. Amazon Echo was not the original name: the product was called Amazon Flash, and it still carried that name on the eve of its 2014 release.
(The Lab126 product family)
The whole R&D process for Amazon Echo took several years, and thousands of people are now improving the product. For example, Echo's response time was 5 seconds at the beginning, was then cut to 1.5 seconds, and is now under 1 second (these are average response times).
In fact, a smart speaker has to solve acoustics + wake-up + recognition + control and semantic understanding + speech synthesis. The hardware front end includes the microphone array, noise reduction algorithms, chip, hardware platform, and so on. The cloud includes speech recognition, semantic understanding, and voice data, with applications such as music, weather, text messaging, and calls provided on top as content.
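The chain of stages described above can be pictured as a simple pipeline. The sketch below is only a toy: plain strings stand in for audio, each function is a keyword-based stand-in for a real component, and every name in it (the wake phrase, `handle`, `Intent`, the reply texts) is hypothetical rather than taken from Echo or from Sonic Technologies.

```python
from dataclasses import dataclass
from typing import Optional

WAKE_WORD = "hello speaker"  # hypothetical wake phrase


@dataclass
class Intent:
    name: str
    slot: str = ""


def front_end(raw_audio: str) -> str:
    """Stand-in for the device side: microphone array plus noise reduction."""
    return raw_audio.strip().lower()


def detect_wake_word(audio: str) -> bool:
    """Stand-in for on-device wake-up detection."""
    return audio.startswith(WAKE_WORD)


def recognize(audio: str) -> str:
    """Stand-in for cloud speech recognition: returns the command text."""
    return audio[len(WAKE_WORD):].strip(" ,")


def understand(text: str) -> Intent:
    """Stand-in for cloud semantic understanding, using keyword rules."""
    if text.startswith("play "):
        return Intent("music", text[len("play "):])
    if "weather" in text:
        return Intent("weather")
    return Intent("unknown", text)


def run_skill(intent: Intent) -> str:
    """Stand-in for the content applications (music, weather, ...)."""
    if intent.name == "music":
        return f"Playing {intent.slot}."
    if intent.name == "weather":
        return "Here is today's weather."
    return "Sorry, I did not understand."


def synthesize(reply: str) -> str:
    """Stand-in for speech synthesis: tags the text instead of producing audio."""
    return f"<speech>{reply}</speech>"


def handle(raw_audio: str) -> Optional[str]:
    """Run one utterance through every stage; stay silent without the wake word."""
    audio = front_end(raw_audio)
    if not detect_wake_word(audio):
        return None
    text = recognize(audio)
    intent = understand(text)
    return synthesize(run_skill(intent))


print(handle("Hello speaker, play jazz"))
print(handle("play jazz"))
```

Each stage depends on the output of the one before it, so an error early in the chain, for instance in the acoustic front end, propagates to everything that follows.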
Wake-up and recognition for Chinese are a big technical challenge. Mixed-language speech and local dialects need continuous optimization. Data accumulation and data labeling require both time and breadth, and specific scenarios (such as navigation) need optimization of their own. All of this takes a long period of accumulation and R&D. Even Google, with its deep technical reserves and great strength, needed at least two years to produce Google Home.
Focus on scenarios with a more than tenfold efficiency gain
Artificial intelligence must be grounded in real scenarios and real products.
For voice products to take hold, they must also deliver better efficiency and a better user experience. They need to find new user scenarios, either improving existing interactions or replacing them.
Historical experience shows that for a new mode of interaction to replace an existing one, it must be ten times more efficient. This also explains why Siri is of only marginal use on the phone: touch-screen interaction already meets users' needs in most scenarios. For voice to play a role on the smartphone, it must find a scenario where touch-screen interaction works poorly.
In its early stage, Sonic Intelligence Technology focused mainly on smart audio, while gradually expanding to customers in smart security, smart healthcare, and robotics. Building on its acoustic modules, Sonic Technologies also created an integrated audio interaction solution for smart audio, including hardware and cloud services.
Chen Xiaoliang said that he is very optimistic about using voice products in the following areas:
Smart hardware: He is very optimistic about upgrading traditional hardware categories, such as smart headphones and smart speakers. Another example is adding voice capabilities to notebooks and TVs. One easily imagined scenario: controlling a television by remote control is far less efficient than direct speech input.
Smart security: Simply put, voice modules with microphone arrays can be added to all cameras.
Smart healthcare: There are many applications for speech in this area. One example is the electronic medical record: a doctor can produce a medical record directly by speaking during diagnosis. Another example is that some medical tests work by detecting sound. Adding a speech module directly can handle both interaction and detection, and can remove the need for interaction through a screen.
Education: Microphone arrays can be used in multimedia classrooms. Another area of application is remote tutoring.
Smart toys: Toys with interactive voice functions hold children's attention better, but given the cost of toys and children's habits, single-microphone recognition algorithms are currently the more suitable option, as in the 360 children's robot and the 360 story machine.
The automotive market: When hands and eyes are occupied (hands-free, eyes-free conditions), voice is the best way to interact, so in-car voice products are a fiercely contested battleground.
Voice will become an important interaction method for the next generation of smart devices; that much is certain. As the industry progresses, higher-quality voice products will enter every aspect of life at lower cost, bringing more convenience to our lives and work.