What is Speakr?
Speakr is a free, open-source AI meeting assistant that automates meeting transcription, summarization, and Q&A over the transcript. It is self-hosted: recordings and transcripts are stored on the user's own server, and when Speakr is configured with self-hosted models, processing also stays local, which reduces the risk of leaking trade secrets or sensitive conversations. Users upload audio files in a range of formats; transcription and summary generation then run automatically in the background without interrupting other work. An interactive chat feature lets users ask questions about a transcript or search it for specific information.
Main Features of Speakr
- Audio Upload and Transcription: Supports multiple audio formats, such as MP3, WAV, and M4A. Users upload files by drag-and-drop or file selection, and transcription runs automatically in the background without blocking the user interface.
- AI-Powered Summarization and Title Generation: Generates a summary and a title for each meeting so users can quickly grasp its core content.
- Interactive Chat: Users query the transcript through a chat interface, asking questions or searching for relevant information, for example “List all action items” or “Budget discussion section.”
- Self-Hosted Security: All data is stored on the user’s own servers, so sensitive recordings and transcripts stay under the user’s control.
- User Management: Supports user registration, login, account management, and recording management. Administrators can manage users and view system statistics.
- Multilingual Support: The language of transcription and of AI-generated content can be configured.
- Search and Highlighting: Supports keyword search with highlighted matches for quickly locating important information.
- Metadata Editing: Users can edit recording metadata, such as titles, participants, meeting dates, summaries, and notes.
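The background-processing behavior described in the first feature can be sketched as follows. This is a minimal illustration, not Speakr's actual code: the `Recording` class, the status names, `fake_transcribe`, and `handle_upload` are all hypothetical, and a thread pool is only one of several ways to keep an upload request from blocking.

```python
from __future__ import annotations

from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Recording:
    """Hypothetical in-memory record (not Speakr's real data model)."""
    filename: str
    status: str = "PENDING"
    transcript: str = ""


def fake_transcribe(path: str) -> str:
    """Placeholder standing in for a real speech-to-text call."""
    return f"(transcript of {path})"


def process(recording: Recording, transcribe=fake_transcribe) -> Recording:
    """Runs in a worker thread: transcribe, then record the outcome."""
    recording.status = "PROCESSING"
    try:
        recording.transcript = transcribe(recording.filename)
        recording.status = "COMPLETED"
    except Exception:
        # Any transcription error marks the recording as failed.
        recording.status = "FAILED"
    return recording


# One shared pool; submit() returns immediately, so the caller is not blocked.
executor = ThreadPoolExecutor(max_workers=2)


def handle_upload(filename: str) -> tuple[Recording, Future]:
    """Register the upload and schedule transcription in the background."""
    recording = Recording(filename)
    future = executor.submit(process, recording)
    return recording, future
```

A caller can return the `Recording` to the user interface right away and poll its `status` field, or wait on the returned future when a blocking result is acceptable.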
Technical Principles of Speakr
- Speech Recognition Technology: Uses an OpenAI-compatible speech-to-text (STT) API, such as one serving the Whisper model, to convert audio files into text. Users can point Speakr at a self-hosted Whisper model or at another compatible API.
- Natural Language Processing (NLP): An AI model summarizes each transcript, generates a title for it, and answers questions about it through the chat interface.
- Backend Framework: Built on Python and Flask, which handle API requests, data storage, and business logic.
- Database: Uses the SQLAlchemy ORM with SQLite (the default) to store user information, audio file records, and transcriptions.
- Frontend Technology: Combines Jinja2 templates, Tailwind CSS, and Vue.js for the user interface.
- Deployment: Supports Docker and local deployment. Docker enables rapid deployment of the application, while local deployment is suited to development and testing environments.
- Security Mechanisms: Implements user authentication and data protection with Flask-Login (session-based login), Flask-Bcrypt (password hashing), and Flask-WTF (form handling with CSRF protection).
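To illustrate the speech-to-text step, here is a minimal sketch of building a request to an OpenAI-compatible transcription endpoint using only the Python standard library. It assumes the server exposes the `/audio/transcriptions` route under its base URL and accepts `file` and `model` form fields, as OpenAI's API does. The function name, base URL, API key, and the model name `whisper-1` are placeholders, and this is not Speakr's own implementation.

```python
import urllib.request
import uuid


def build_transcription_request(base_url, api_key, filename, audio_bytes,
                                model="whisper-1"):
    """Build (but do not send) a multipart POST for an OpenAI-compatible STT API.

    All argument values are supplied by the caller; the default model name is
    an assumption and may differ on a self-hosted server.
    """
    boundary = uuid.uuid4().hex

    # Plain text form field carrying the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()

    # File form field carrying the raw audio bytes.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + audio_bytes + b"\r\n"

    closing = f"--{boundary}--\r\n".encode()
    body = model_part + file_part + closing

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; an OpenAI-compatible server would typically answer with JSON containing a `text` field holding the transcript. Pointing `base_url` at a self-hosted server keeps the audio on infrastructure the user controls.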
Project URL
- GitHub repository: https://github.com/murtaza-nasir/speakr
Application Scenarios of Speakr
- Internal Corporate Meetings: Quickly generates minutes for internal project and team meetings, keeping sensitive information in-house and making team review and task follow-up easier.
- Education: Teachers upload classroom recordings to generate detailed class notes that help students review the material.
- Remote Collaboration: Transcribes and summarizes recordings of remote team meetings so that members can catch up quickly, which simplifies task assignment and project management.
- Personal Study and Note-Taking: Students and other individuals upload recordings of important meetings or lectures and generate detailed notes for later review.
- Healthcare Industry: Documents medical case discussions and training sessions for future reference while keeping patient information on self-hosted infrastructure.