On November 22nd, Baidu's chief scientist Wu Enda (Andrew Ng) presented the latest developments in Baidu's voice technology, introduced four new voice technologies, and announced that they would be free for users and developers.
In fact, open speech APIs (application programming interfaces) have become an industry trend. In March of this year, Google released a new machine learning platform for developers at its Next cloud computing conference and opened up a speech recognition API, the technology behind Google Voice Search and voice input. The Google Cloud Speech API will be free at first and charged for later. It covers more than 80 languages and is aimed at a variety of real-time speech recognition and translation applications.
By opening up these interfaces, Internet companies hope to drive the further evolution of intelligent voice models and the rapid spread of intelligent voice technology.
Wu Enda said that Baidu has no plans to charge for the four voice technologies it announced. These technologies are designed to solve some of the key problems that commonly trouble users of voice interaction. "Speech recognition today has exceeded the recognition ability of ordinary people."
Emotional synthesis, for example, is based mainly on deep learning and big data processing. It introduces a series of innovations in data acquisition, processing, and modeling to produce a more expressive, natural-sounding reading voice.
Jin Dashi, general manager of Reader Gansu Digital Technology Co., Ltd., told reporters that the "Reader Digital Farm Bookstore" is currently being piloted in Qingyang City, Gansu Province. Drawing on Baidu's big data, it uses emotional speech synthesis to read books aloud, so that many illiterate elderly people and left-behind children can also enjoy reading.
The far-field solution is a far-field recognition technology developed independently by Baidu. Built on a microphone array, it achieves high-accuracy far-field recognition by combining beamforming, speech enhancement, echo cancellation, and sound source localization.
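To make the beamforming step concrete, here is a minimal delay-and-sum sketch in Python. It is only an illustration of the general technique, not Baidu's implementation, which the article does not describe. The linear four-microphone layout, the 5 cm spacing, the 16 kHz sample rate, and the function names are all assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def steering_delays(mic_positions_m, angle_deg, sample_rate_hz):
    """Arrival delay of a far-field plane wave at each microphone, in whole samples.

    mic_positions_m: microphone positions along a linear array, in metres (assumed layout).
    angle_deg: direction of arrival measured from broadside (0 = straight ahead).
    """
    sin_theta = math.sin(math.radians(angle_deg))
    delays = [round(x * sin_theta / SPEED_OF_SOUND * sample_rate_hz)
              for x in mic_positions_m]
    earliest = min(delays)
    return [d - earliest for d in delays]  # shift so every delay is >= 0


def delay_and_sum(channels, delays):
    """Undo each channel's delay, then average the aligned channels.

    Sound from the steered direction adds up coherently, while sound
    from other directions is misaligned and partly cancels.
    """
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
            for n in range(usable)]


if __name__ == "__main__":
    # Hypothetical array: 4 microphones, 5 cm apart, sampled at 16 kHz.
    mics = [0.0, 0.05, 0.10, 0.15]
    delays = steering_delays(mics, angle_deg=30.0, sample_rate_hz=16000)

    # Simulate a talker at 30 degrees: each microphone hears a delayed copy.
    source = [math.sin(0.3 * n) for n in range(200)]
    channels = [[0.0] * d + source[:len(source) - d] for d in delays]

    aligned = delay_and_sum(channels, delays)
    print(delays, len(aligned))
```

Integer sample delays keep the sketch short. A production system would more likely use fractional delays or frequency-domain weights, and would run echo cancellation and speech enhancement around this step.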
Baidu said that developers can use the new interfaces to extend the speech recognition distance to 3 to 5 meters and raise a device's voice wake-up rate to over 95%, or to address the accuracy of long-form speech recognition. This opens up more possibilities for voice technology than today's uses, which go little further than remote control or unlocking a phone.
For example, Baidu's "small robot" voice-interactive meal-ordering system, in use at the KFC flagship store in Shanghai, can take orders at any distance.
"We are already at the dawn of artificial intelligence." Wu Enda struck an optimistic tone with the media, and he hopes that opening up artificial intelligence technology will let everyone develop "smart applications" more easily.
However, it may take time for artificial intelligence, still standing at its "dawn", to make a qualitative leap. One telling detail: in the conference room where reporters interviewed Wu Enda, a stenographer still sat close by, transcribing the conversation in real time. Speech recognition is an important part of AI, and intelligent voice technology has set off a scramble among the tech giants.