Sunday, December 25, 2016


  1. I consider my QUIK stock 10% Sensory. Have for a while. Makes me feel diversified :)

    Because of the hard-coded LPSD, etc. They can do well together.
    How many people have a Sensory Inc. LPSD hardwired with an MCU above it on an SoC?

    there will be more.....
  2. jfieb

    jfieb, Well-Known Member


    A Blast from the past....'11

    It shows just how far ahead of its time Sensory was. Just amazing. Use this as context for what we are about to see at CES!

    THE HOLY GRAIL IN SPEECH IS ALMOST HERE!
    May 6, 2011

    For far too long, speech recognition just hasn’t worked well enough to be usable for everyday purposes. Even simple command and control by voice had been barely functional and unreliable…but times, they are a changing! Today speech recognition works quite well and is widely used in computer and smart phone applications…and I believe we are rapidly converging on the Holy Grail of Speech – making a recognition and response system that can be virtually indistinguishable from a human (a really smart human with immaculate spelling skills and fluency in many languages!)

    I think there are 4 important components to what I’d call the Holy Grail in Speech:

    1. No Buttons Necessary. OK here I’m tooting my own whistle, but Sensory has really done something amazing in this area. For the first time in history there is a technology that can be always-on and always-listening, and it consistently works when you call out to it and VERY rarely false-fires in noise and conversation! This just didn’t exist before Sensory introduced the Truly Handsfree™ Voice Control, and it is a critical part of a human-like system. Users don’t want to have to learn how to use a device, Open Apps, and hold talk buttons to use! People just want to talk naturally, like we do to each other! This technology is HERE NOW and gaining traction VERY rapidly.
    2. Natural Language Interactions. This is a bit tricky, because it goes way beyond just speech recognition; there has to be “meaning recognition”. Today, many of the applications running on smart phones allow you to just say what you want. I use SIRI (Nuance), Google and Vlingo pretty regularly, and they are all very good. But what’s impressive to me isn’t just how good they are, it’s the rate at which they seem to be improving. Both the recognition accuracy and the understanding of intent seem to be gaining ground very rapidly.
      I just did a fun test…I asked each engine (in my nice quiet office) “How many legs does an insect have?”…and all three interpreted my request perfectly. Google and Vlingo called up the right website with the question and answer…and SIRI came back with the answer – six! Pretty nice! My guess is the speech recognition is still a bit ahead of the “meaning recognition”…
      Just tried another experiment. I asked “Where can I celebrate Cinco de Mayo?” SIRI was smart enough to know I wanted a location, but tried to send me off to Sacramento (sorry – too far away for a margarita!) Vlingo and Google both rely on Google search, and did a general search which didn’t seem to associate my location… (one of them mis-recognized, but not so badly that they didn’t spit out identical results!) Anyways, I’d say we are close in this category, but this is where the biggest challenge lies.
    3. Accurate Translation and Transcription. I suppose this is ultimately important in achieving the Holy Grail. I don’t do much of this myself, but it’s an important component to Item 2 above, and also necessary for dictating emails and text messages. When I last tested Nuance’s Dragon Dictate I was blown away by how well it performed. It’s probably the Nuance engine used in Apple’s Siri (you know, Nuance has a lot of engines to choose from!), and it’s really quite good. I think Nuance is a step ahead in this area.
    4. Human Sounding TTS. The TTS (text-to-speech) technology in use today is quite remarkable. There are really good sounding engines from ATT, Nuance, Acapela, Neospeech, SVOX, Ivona, Loquendo and probably others! They are not quite “human”, but come very close. As more data gets thrown at unit selection (yes, size will not matter in the future!), they will essentially become intelligently spliced-together recordings that are indistinguishable from live performance.
    Anyways, reputable companies are starting to combine and market these kinds of functions today, and I’d guess it’s a just a matter of five to ten years until you can have a conversation with a computer or smartphone that’s so good, it is difficult to tell whether it’s a live person or not!
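The "always-on, always-listening" trigger described in item 1 above can be pictured as a loop that scores each incoming audio frame against a keyword template and only fires when the score clears a threshold. The sketch below is purely illustrative — it is NOT Sensory's algorithm; the scoring function, the threshold, and the toy feature frames are all placeholder assumptions.

```python
# Conceptual sketch of an always-listening keyword-spotting loop.
# Hypothetical example only: real systems use acoustic models, not
# a mean-absolute-difference score on raw feature vectors.

def frame_score(frame, template):
    """Toy similarity score: negative mean absolute difference
    between a feature frame and the keyword template."""
    return -sum(abs(f - t) for f, t in zip(frame, template)) / len(template)

def spot_keyword(frames, template, threshold=-0.1):
    """Return the indices of frames whose score clears the trigger
    threshold -- i.e., the moments the device would 'wake up'."""
    return [i for i, frame in enumerate(frames)
            if frame_score(frame, template) >= threshold]

# Simulated feature frames: background noise, one near-match, silence.
template = [0.2, 0.5, 0.9, 0.5, 0.2]
frames = [
    [0.9, 0.1, 0.4, 0.8, 0.3],      # noise -> should not trigger
    [0.21, 0.52, 0.88, 0.49, 0.2],  # near-match -> should trigger
    [0.0, 0.0, 0.0, 0.0, 0.0],      # silence -> should not trigger
]
print(spot_keyword(frames, template))  # -> [1]
```

The design tension the blog hints at is exactly here: the threshold trades off missed wake-ups against false-fires in noise, and doing this scoring continuously is why ultra-low-power operation matters.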
  3. jfieb

    jfieb, Well-Known Member


    To my good friend RC, who is new here ( and happy).
    Thanks for today. -).
    BFAST after CES?

    Here is what I want to put up for ALL to review. Expect that PEEL will want to highlight its new voice capability.
    Voice will BE THE key thing at CES (Huffington Post item).
    Many of the players are GLOBAL.


    Sensory Inc. Speech Recognition Solutions for Consumer Products Support Language Capabilities Across the Globe
    Santa Clara, CA – July 16, 2014 – World’s Most Highly Spoken Language, Mandarin Chinese, Among Languages Also Supported by Sensory’s Speaker Verification Technology.

    Sensory Inc., the industry leader in speech and vision technologies for consumer products, is pleased to announce that it supports a wide range of languages across 41 countries all over the world with its innovative speech recognition solutions.

    Compare that to Facebook, which wants 70 languages in that job opening.
    So PEEL can roll across its global footprint: America, Asia, South America.


    Languages currently supported by Sensory’s speaker recognition technologies include the world’s three most highly spoken: Mandarin (2 billion speakers), Spanish (406 million speakers), and English (335 million speakers). Other languages developed for Sensory’s platforms include: French, German, Italian, Japanese, Korean, Dutch, Russian, Arabic, Turkish, Swedish, and Portuguese. Nearly all of the languages available in Sensory’s speech recognition solutions are also supported in its speaker verification technologies, including Mandarin Chinese.

    Among its products in multiple languages is Sensory’s landmark TrulyHandsfree™, the leading always-on, always-listening voice control solution for consumer electronics. The introduction of the TrulyHandsfree™ Voice Control technology revolutionized the speech technology industry for a wide variety of hands-free consumer applications. With an extremely noise-robust and accurate solution that responds quickly and at ultra-low power consumption, the TrulyHandsfree™ trigger technology has become the most widely adopted keyword spotting technology in the speech industry.

    Sensory’s staff of world-renowned speech experts and linguists is continuing to expand the company’s language and country support. Languages currently in development include Indian English, Polish, Greek, and Cantonese, with others soon to be added.


    So this is from '14; expect that they have added many of the above, but NOT Scottish. (A joke for the real GEEKs.)


    “We are committed to providing the most innovative global speech and speaker solutions for deployment in consumer electronic applications,” stated Sensory CEO Todd Mozer. “The term ‘world-class’ truly defines our technology for the diversity of languages and international regions that it supports, and our continued investment in resources to develop and expand these language offerings.”



    So as you think over the LPSD and the QUIK Eos, understand that a major player with an app across all the continents...will have it all covered with the QUIK/Sensory solution. So this is SO very important. If they use the Eos they can have a global menu of phrases.
    In this case the adjacent possible is just what a global brand needs.

    Facebook wants a team for 70 languages for their platform, as TM wrote about in '11.

    The adjacent possible does not have to be a 5 x 10 broom closet; it can also be a well-lit stage where great performances will occur.

    The Holy Grail is almost here....

    1. No Buttons Necessary. OK here I’m tooting my own whistle, but Sensory has really done something amazing in this area. For the first time in history there is a technology that can be always-on and always-listening, and it consistently works when you call out to it and VERY rarely false-fires in noise and conversation! This just didn’t exist before Sensory introduced the Truly Handsfree™ Voice Control, and it is a critical part of a human-like system. Users don’t want to have to learn how to use a device, Open Apps, and hold talk buttons to use! People just want to talk naturally, like we do to each other! This technology is HERE NOW and gaining traction VERY rapidly.

  4. jfieb

    jfieb, Well-Known Member


    Sensory Inc. in the Journal of Phonetics on Indian English



    The effects of native language on Indian English sounds and timing patterns

    Article in Journal of Phonetics 41(6):393-406, November 2013
    DOI: 10.1016/j.wocn.2013.07.004

    Abstract
    This study explored whether the sound structure of Indian English (IE) varies with the divergent native languages of its speakers or whether it is similar regardless of speakers' native languages. Native Hindi (Indo-Aryan) and Telugu (Dravidian) speakers produced comparable phrases in IE and in their native languages. Naïve and experienced IE listeners were then asked to judge whether different sentences had been spoken by speakers with the same or different native language backgrounds. The findings were an interaction between listener experience and speaker background such that only experienced listeners appropriately distinguished IE sentences produced by speakers with different native language backgrounds. Naïve listeners were nonetheless very good at distinguishing between Hindi and Telugu phrases. Acoustic measurements on monophthongal vowels, select obstruent consonants, and suprasegmental temporal patterns all differentiated between Hindi and Telugu, but only 3 of the measures distinguished between IE produced by speakers of the different native languages. The overall results are largely consistent with the idea that IE has a target phonology that is distinct from the phonology of native Indian languages. The subtle L1 effects on IE may reflect either the incomplete acquisition of the target phonology or, more plausibly, the influence of sociolinguistic factors on the use and evolution of IE.

    Sensory Inc. and its CEO, who stuck to his vision for 20 yrs, deserve ALL the limelight they got in '16, and I expect TM will just be TOO busy to write many blogs anymore. Just too busy to write; he does NOT have to evangelize any more, he is writing contracts as fast as they can work out the details.......

    Maybe they will finally get to Scottish in '17? ;-)

    And here is an important TM snip on the BIG dogs...

    ...but they have grown their products to a very usable accuracy level through deep learning, yet lost much of the advantage of small footprint and low power in the process.

    Facebook needs Sensory Inc.?
