The Future of Enterprise Search
In this post we go into the motivation and vision behind Caption’s new product and sketch the roadmap.
Search was supposed to be a solved problem, right? The principal algorithms have been around for decades at this stage. Not only did Google revolutionize the field for consumers all over the world, but enterprise players like Elastic, Algolia, and numerous others have made it easy for companies of any size, from Fortune 500 giants to nimble startups, to seamlessly index and search their files. What, then, is the catch?
Motivation and opportunity
The core issue is that the main players in the enterprise search space are text-centric. They were architected with textual documents in mind as the primary type of content within enterprises. While they work well for their intended document types, they leave large swathes of content unhandled.
Audio and video files are the clearest examples. There is currently no simple, out-of-the-box way to index these files and make them searchable. From videos on Coursera and podcasts to internal training videos during my time at Amazon, there was never a way to find the moments when particular phrases and keywords occur, which led to much annoyance. Caption aims to fill this void.
Caption is a SaaS product allowing users to upload, index, transcribe, and search their audio and video files. It consists of two parts: an easy-to-use dashboard through which to perform the operations, as well as an API that allows users to integrate audio/video search into their own applications. To get a better idea of how it works, check out the tutorial videos we created.
The blockbuster feature is finding the timestamps in your files where a certain keyword occurs. All this is made possible by huge advances in machine learning, which make transcription, keyword extraction, and similar operations easy. Caption currently supports twenty-six languages, with the caveat that quality varies somewhat between the major languages and the less widely spoken ones.
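To make the idea of keyword timestamp search concrete, here is a minimal sketch in Python. It is not Caption's API or implementation: the `Segment` structure, the `find_keyword` function, and the sample transcript are all hypothetical, and the sketch assumes a transcription step has already produced text segments with start and end times.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span of audio or video (hypothetical structure)."""
    start: float  # seconds from the beginning of the file
    end: float    # seconds from the beginning of the file
    text: str


def find_keyword(segments: list[Segment], keyword: str) -> list[float]:
    """Return the start times of segments whose text contains the keyword.

    Matching is a simple case-insensitive substring test; a real system
    would more likely use an inverted index and proper tokenization.
    """
    needle = keyword.casefold()
    return [s.start for s in segments if needle in s.text.casefold()]


# Made-up transcript, for illustration only.
transcript = [
    Segment(0.0, 4.2, "Welcome to the course on machine learning."),
    Segment(4.2, 9.8, "Today we cover gradient descent."),
    Segment(9.8, 15.0, "Gradient descent minimizes a loss function."),
]

print(find_keyword(transcript, "gradient descent"))  # [4.2, 9.8]
```

The returned values are offsets in seconds, which a player could use to jump straight to the matching moment.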
There are myriad use cases for audio and video search, but here we highlight a few of the most prominent ones.
- Media: Media organizations and broadcasters have a vast trove of audio and video materials. Their editors and producers frequently struggle to fetch information from older records, and are forced to search only metadata.
- Online education: Audio and video search software is a perfect fit for online education platforms. Their educational videos have high information density, and allowing students to find the exact moments when a certain concept is covered is immensely powerful.
- Podcasts: Podcasts have seen a massive surge in popularity in the past few years. Yet not only is there a profusion of podcasts, but episodes are also long and oftentimes freewheeling. Hence, organizing the data and enabling users to find the moments they care about most is key.
The road ahead
The Caption demo and prototype, which we’ve so far circulated privately, have generated tremendous enthusiasm, particularly in the media and online education verticals. This has convinced us to double down on the product and bring our vision to customers around the world. We have several main items on the roadmap.
- SaaS product: We’ll continue developing our SaaS product, consisting of the aforementioned dashboard and API. We’ll be adding support for more languages, more sophisticated queries (such as fetching similar video/audio records), keyword extraction, and more ingest options.
- On-premise deployments: While we believe the cloud is the future, we know that customers in many industries are not quite prepared to make the leap yet. So we’ll be providing on-premise deployments with a variety of ingest options, enterprise-level security features, and the ability for customers to plug in as many of their own components as they desire.
- A top-secret project: We’ll be launching a sub-product in the upcoming weeks that will showcase our technology. And I believe everyone will find it quite exciting. No hints yet, but stay tuned. 🙃
Marin is the co-founder and CEO of Caption. He previously founded QuickNews, a machine learning-powered news app, and was a software engineer at Amazon, where he worked on S3 and Alexa.