Wednesday, June 15, 2016

Does ZTE, #8 in the world, even know who Sensory Inc. is?

Sure they do....

  • Sensory blog post:
GOOD TECHNOLOGY EXISTS – SO WHY DOES SPEECH RECOGNITION STILL FALL SHORT?
March 30, 2015

At Mobile World Congress, I participated in ZTE's mobile voice panel. ZTE presented data researched in China that basically said people want to use speech recognition on their phones, but they don’t use it because it doesn’t work well enough. I have seen similar data on US mobile phone users, and the automotive industry has also shown data supporting the high level of dissatisfaction with speech recognition.

In fact, when I bought my new car last year I wanted the state of the art in speech recognition to make navigation easier… but sadly I’ve come to learn that the system used in my Lexus just doesn’t work well — even the voice dialing doesn’t work well.

As an industry, I feel we must do better than this, so in this blog I’ll provide my two cents as to why speech recognition isn’t where it should be today, even when technology that works well exists:

  1. Many core algorithms, especially the ones provided to the automotive industry, are just not that good. It’s kind of ironic, but the largest independent supplier of speech technologies actually has one of the worst-performing speech engines. Sadly, it’s this engine that gets used by many of the automotive companies, as well as some of the mobile companies.
  2. Even many of the good engines don’t work well in noise. In many tests, Google’s speech recognition comes out on top, but when the environment gets noisy, even Google fails. I use my Moto X to voice dial while driving (at least I try to). I also listen to music while driving. The “OK Google Now” trigger works great (kudos to Sensory!), but everything I say after that gets lost and I see an “it’s too noisy” message from Google. I end up turning down the radio to voice dial, or I use Sensory's voice dial app, because Sensory always works… even when it’s noisy!
  3. Speech application designs are really bad. I was using the recognizer last week on a popular phone. The room was quiet, I had a great internet connection, and the recognizer was working great, but as a user I was totally confused. I said “set alarm for 4am” and it accurately transcribed “set alarm for 4am,” but rather than confirm that the alarm was set for 4am, it asked me what I wanted to do with the alarm. I repeated the command; it accurately transcribed again and asked one more time what I wanted to do with the alarm. Even though it was recognizing correctly, it was interfacing so poorly with me that I couldn’t tell what was happening, and it didn’t appear to be doing what I asked it to do. Simple and clear application designs can make all the difference in the world.
  4. Wireless connections are unreliable. This is a HUGE issue. If the recognizer only works when there’s a strong Internet connection, then the recognizer is going to fail A GREAT DEAL of the time. My prediction – over the next couple of years, the speech industry will come to realize that embedded speech recognition offers HUGE advantages over the common cloud based approaches used today – and these advantages exist in not just accuracy and response time, but privacy too!
Deep learning nets have enabled some amazing progress in speech recognition over the last five years. The next five years will see embedded recognition with high performance noise cancelling and beamforming coming to the forefront, and Sensory will be leading this charge… and just like how Sensory led the way with the “always on” low-power trigger, I expect to see Google, Apple, Microsoft, Amazon, Facebook and others follow suit.
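Point 4 above argues that a recognizer tied to the cloud fails whenever the connection does, while an embedded engine keeps working. A minimal sketch of that "embedded first, cloud fallback" idea is below; `recognize_on_device` and `recognize_in_cloud` are made-up placeholder functions for illustration, not any real Sensory or Google API.

```python
# Hypothetical sketch of an embedded-first recognition pipeline.
# The two engine functions below are invented stand-ins.

def recognize_on_device(audio):
    # An embedded engine always returns something, even offline.
    # We fake a low-confidence result to exercise the fallback path.
    return {"text": "call home", "confidence": 0.55}

def recognize_in_cloud(audio, connected):
    # A cloud engine may be more accurate, but only with a connection.
    if not connected:
        raise ConnectionError("no network")
    return {"text": "call home", "confidence": 0.95}

def recognize(audio, connected):
    """Try the embedded engine first; consult the cloud only to refine
    a low-confidence result, and only when a connection exists."""
    local = recognize_on_device(audio)
    if local["confidence"] >= 0.8:
        return local["text"]          # good enough, no network needed
    try:
        return recognize_in_cloud(audio, connected)["text"]
    except ConnectionError:
        return local["text"]          # offline: degrade gracefully

print(recognize(b"...", connected=False))  # still answers with no network
```

The key property is the last branch: with no connection the user still gets an answer from the local engine instead of an "it's too noisy" or "no network" error.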


So he has a bias, the same as QUIK's, but he says very clearly what he sees. Imagine how good the recognition has to be for Mandarin and other languages in China.

Since QUIK came out with the hardcoded Sensory solution, Sensory announced they are working with Philips IP for better noise cancellation, and just maybe QUIK has used, or will use, these algos on the Eos.
It is not a bad question to ask at some point, but I just know QUIK will have what they need in this regard.


Any key here I got from reading too much?

Yes, neural networks have been Sensory Inc.'s approach since day one; their focus from the start was to keep it on the device. Neural networks are not just lower power by some 20% or so, they
are lower power by an order of magnitude over other approaches. 10X better.
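To illustrate why a small on-device neural net can run "always on": scoring one audio frame is just a couple of tiny matrix multiplies, cheap enough for a low-power embedded core. The toy network below is purely illustrative; the weights, sizes, and feature inputs are all made up and bear no relation to Sensory's actual trigger.

```python
# Toy wake-word scorer: one hidden layer, 2 input features, 3 hidden
# units. All weights here are invented for illustration only.
import math

W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]   # 3 hidden units x 2 features
b1 = [0.0, 0.1, -0.1]
W2 = [0.7, -0.5, 0.9]                          # hidden -> keyword score
b2 = -0.2

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def keyword_score(features):
    """Score one audio frame (two made-up acoustic features) for the
    wake word; the result is a probability-like value in (0, 1)."""
    hidden = [sigmoid(sum(w * f for w, f in zip(row, features)) + b)
              for row, b in zip(W1, b1)]
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)

score = keyword_score([0.9, 0.3])   # pretend features from one frame
print(0.0 < score < 1.0)            # prints True
```

Counting the work: roughly 3x2 + 3 multiply-adds per frame here; even a realistically sized net is a fixed, small amount of arithmetic per frame, which is why it can idle at milliwatt power while a cloud round-trip cannot.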

Bob S., ZTE is already primed from their research to add a good voice recognition solution to their flagships, and they are #8 in the world.
Can you sign them up?

Thanks in advance.

Looking forward to MWC in Shanghai.



With devices like the ZTE Star 2...

So who sold the voice stuff to ZTE anyway?

Yup, you got it, it was Bob S.

Reuters
Dec 23, 2014 - "For ZTE's newest flagship device the goal was to offer superior voice ... ZTE and Audience are to Advanced Voice," said Robert Schoenfield

Bob, can you get the #8 global OEM to use the Eos? Thanks in advance. Are those familiar China algos ready for MWC Shanghai?
