About the Book
PART 1 - Voice System Foundations

Chapter 1: Say Hello to Voice Systems
Chapter goal: Introduce the reader to voice-first technology, its core concepts, and the typical phases of development, with background on the current state and challenges of voice.
Number of pages: 20
Sub-topics
1. Voice-first, voice-only, and conversational everything
2. Introduction to voice technology components (Speech to text, Natural language understanding, Dialog management, Natural language generation, Text to speech); a minimal pipeline sketch follows this list
3. The phases of voice development success (Plan, Design, Build, Test, Deploy & Assess, Iterate)
4. Hope is not a strategy - but to plan & execute is
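To make the component list above concrete, the technology can be pictured as five stages chained in sequence. The sketch below is purely illustrative: the function names (transcribe, understand, decide, generate, synthesize) and their stubbed return values are hypothetical placeholders, not APIs or code from the book.

```python
# Illustrative voice pipeline: every stage is a hypothetical stub, not a
# real API. A production system would call an STT service, an NLU model,
# and so on at each step.

def transcribe(audio_in: bytes) -> str:
    """Speech to text: turn the user's audio into a text transcript."""
    return "find me a pizza place nearby"   # stubbed transcript

def understand(transcript: str) -> dict:
    """Natural language understanding: map text to an intent and slots."""
    return {"intent": "find_restaurant", "slots": {"cuisine": "pizza"}}

def decide(interpretation: dict, state: dict) -> dict:
    """Dialog management: choose the next action given intent and context."""
    return {"action": "offer_results", "cuisine": interpretation["slots"]["cuisine"]}

def generate(action: dict) -> str:
    """Natural language generation: turn the chosen action into a prompt."""
    return f"I found three {action['cuisine']} places near you. Want to hear them?"

def synthesize(prompt: str) -> bytes:
    """Text to speech: render the prompt as audio for playback."""
    return prompt.encode()                  # stand-in for synthesized audio

def one_turn(audio_in: bytes, state: dict) -> bytes:
    """One full turn of a voice dialog, chaining the five components."""
    transcript = transcribe(audio_in)
    interpretation = understand(transcript)
    action = decide(interpretation, state)
    prompt = generate(action)
    return synthesize(prompt)
```

In a real system each stub would call a cloud service or model, but the data flow (audio in, transcript, interpretation, action, prompt, audio out) stays the same.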
Chapter 2: Keeping Voice in Mind
Chapter goal: Explain to the reader how humans and computers "talk" and "listen": what's easy and hard for the human user and the technology in a dialog, and why.
Number of pages: 15
Sub-topics
1. Why voice is different
2. Hands-on: A pre-coding thought experiment
3. Voice dialog and its participants
   - The Human: spoken natural language understanding
   - The Computer: voice system recognition and interpretation
   - Human-computer voice dialog: successful voice-first development is all about coordinating human abilities with the technology to allow conversations between two very different dialog participants.
Chapter 3: Running a Voice Implementation - and Noticing Issues
Chapter goal: Allow the reader to put their newly learned foundation into practice by implementing and running a simple voice application in the Google Assistant framework, and to experience how quickly even a simple voice interaction needs improvement.
Number of pages: 15
Sub-topics
1. Hands-on: Preparing a restaurant finder
2. Introducing voice platforms
3. Hands-on: Implementing the restaurant finder (basic setup, specifying a first intent, doing something, what the user says, what the VUI says, connecting Dialogflow to Actions on Google, testing the app, saving the voice interaction); a minimal fulfillment sketch follows this chapter outline
4. Google's voice development ecosystem, and why we're using it here
5. The pros and cons of relying on tools
6. Hands-on: Making changes - testing and iterating (adding phrases to handle the same meaning, additional content, and more specific phrasing)
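As a preview of the "doing something" step in sub-topic 3, here is a rough sketch of what a Dialogflow ES fulfillment webhook for the restaurant finder could look like. It assumes the standard Dialogflow ES v2 webhook JSON (queryResult, intent.displayName, parameters, fulfillmentText); the intent name findRestaurant, the cuisine parameter, and the Flask setup are hypothetical examples rather than the book's actual project.

```python
# Rough sketch of a Dialogflow ES fulfillment webhook for a restaurant
# finder. "findRestaurant" and "cuisine" are hypothetical example names.

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/fulfillment", methods=["POST"])
def fulfillment():
    # Dialogflow POSTs the matched intent and extracted parameters here.
    body = request.get_json(silent=True) or {}
    query_result = body.get("queryResult", {})
    intent = query_result.get("intent", {}).get("displayName", "")
    params = query_result.get("parameters", {})

    if intent == "findRestaurant":
        cuisine = params.get("cuisine", "")
        if cuisine:
            reply = f"I found a few {cuisine} places near you. Want to hear the closest one?"
        else:
            reply = "I found a few places near you. Want to hear the closest one?"
    else:
        reply = "Sorry, I didn't catch that. What kind of food are you in the mood for?"

    # Dialogflow speaks or displays whatever is returned in fulfillmentText.
    return jsonify({"fulfillmentText": reply})

if __name__ == "__main__":
    app.run(port=8080)
```

Dialogflow sends each matched user utterance to this endpoint as a POST and responds to the user with whatever is returned in fulfillmentText.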
PART 2 - Planning Voice System Interactions

Chapter 4: Defining Your Vision: Building What, How, and Why for Whom
Chapter goal: Introduce voice-focused requirement discovery, highlighting differences from other modalities and devices.
Number of pages: 25
Sub-topics
1. Functional requirements: What are you building? (General and detailed functionality)
2. Non-functional business requirements: Why are you building it? (Purpose, underlying service and existing automation, branding and terminology, data needs, access and availability, legal and business constraints)
3. Non-functional user requirements: Who will use it and what do they want? (User population demographics and characteristics, engagement patterns, mental models and domain knowledge, environment and state of mind)
4. Non-functional system requirements: How will you build it? (Available options for recognizer, parser, and interpreter; external data sources; data storage and data access; other system concerns)
Chapter 5: From Discovery to UX and UI Design: Tools of the Voice-First Trade
Chapter goal: Show how to turn discovery findings into high-level architectural designs, using flow diagrams and sample dialogs.
About the Authors: Ann Thymé-Gobbel's career has focused on how people use speech and natural language to communicate with each other and with technology. After earning her PhD in cognitive science and linguistics from UC San Diego, she has held a broad set of voice-related UI/UX design roles in both large corporations and small start-ups, working with diverse teams in product development, client project engagements, and R&D. Her past work includes design, data analysis, and establishing best practices at Nuance, voice design for mobile and in-home devices at Amazon Lab126, and creating natural language conversations for multimodal healthcare apps at 22otters. Her research has covered automatic language detection, error correction, and discourse structure. She is currently Director of UI/UX Design at Loose Cannon Systems, the team bringing to market Milo, a hands-free wearable communicator. Ann never stops doing research: she collects and analyzes data at every opportunity and enjoys sharing her findings with others, having presented and taught at conferences internationally.
Charles Jankowski has over 30 years of experience in industry and academia developing applications and algorithms that bring advanced speech recognition, speaker verification, and natural language technologies to real-world users. He has applied state-of-the-art machine learning processes and techniques to data analysis, performance optimization, and algorithm development. Charles combines deep technical expertise in these technologies with effective management of cross-functional teams across all facets of application deployment and outstanding relationships with clients. Currently, he is Director of NLP at Brain Technologies, creating the Natural iOS application with which you can "Say it and Get it." Previously he was Director of NLP and Robotics at CloudMinds, Director of Speech and Natural Language at 22otters, Senior Speech Scientist at Performance Technology Partners, and Director of Professional Services at Nuance. He has also been an independent consultant. Charles holds S.B., S.M., and Ph.D. degrees from MIT, all in electrical engineering.