talk-tailor

command module

Version: v0.0.0-...-a072aca | Published: May 12, 2025 | License: MIT | Imports: 20 | Imported by: 0

Repository

github.com/icereed/talk-tailor

README

TalkTailor

TalkTailor is an open-source, AI-powered web application for transcribing, analyzing, and improving your talks or speeches. It features a modern React frontend and a robust Go backend, leveraging OpenAI's Whisper and GPT models for fast, accurate transcription and advanced text analysis.

Features

🎙️ Record or Upload Audio: Record directly in your browser or upload MP3/video files.
✨ AI-Powered Transcription: Uses OpenAI Whisper for high-quality, multi-language transcription.
📝 Text Correction: Automatic grammar and formatting correction of transcriptions.
🧠 Outline & Bullet Points: Instantly generate speaker outlines and bullet points from your transcript using GPT-4.
💾 Local Storage: Stores your recent transcriptions and audio securely in your browser.
📋 Copy & Edit: Edit, copy, and manage your transcripts with ease.
🚀 Modern UI: Built with React and Ant Design for a seamless user experience.
🐳 Docker Support: Easy deployment with Docker.

Architecture

graph TD
  A[User: Browser] -->|Audio Upload/Record| B(React Frontend)
  B -->|REST API| C(Go Backend)
  C -->|Transcription| D(OpenAI Whisper API)
  C -->|Outline/Bulletpoints| E(OpenAI GPT API)
  B -->|Transcription, Outline, Bulletpoints| A

Demo

Record or Upload: Start a new recording or upload an audio/video file.
Transcribe: Let the AI transcribe and correct your speech.
Analyze: Generate outlines or bullet points for your talk.
Edit & Copy: Refine and copy your transcript for further use.

Getting Started

Prerequisites

Go 1.20+
Node.js 18+
npm or yarn
Docker (optional, for containerized deployment)
OpenAI API key

Clone the Repository

git clone https://github.com/icereed/talk-tailor.git
cd talk-tailor

Local Development

1. Backend (Go)

Set your OpenAI API key:

export OPENAI_API_KEY=your-openai-key

Install dependencies and run the server:

go mod tidy
go run main.go

The backend serves the built frontend from client/dist and exposes the API endpoints at http://localhost:8080.

2. Frontend (React)

cd client
npm install
npm run build

The production build will be placed in client/dist and served by the Go backend.

Docker Usage

You can run the latest image directly from GitHub Container Registry (GHCR):

docker run -e OPENAI_API_KEY=your-openai-key -p 8080:8080 ghcr.io/icereed/talk-tailor:latest

Or build and run locally:

docker build -t talk-tailor .
docker run -e OPENAI_API_KEY=your-openai-key -p 8080:8080 talk-tailor

Image source: ghcr.io/icereed/talk-tailor

API Reference

`POST /api/transcribe`

Description: Upload an MP3 audio file or a video file to receive a transcription.
Request: multipart/form-data with audio file field.
Response: JSON with original and corrected transcription.
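
A Go client could build the upload request roughly as follows. This is a sketch under assumptions: the form field name "audio" is a guess based on the wording above, and newTranscribeRequest is a hypothetical helper. The response field names are not documented here, so decoding is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
)

// newTranscribeRequest builds a multipart/form-data POST for /api/transcribe.
// fieldName is an assumption: the README only says "audio file field".
func newTranscribeRequest(baseURL, fieldName, filename string, data []byte) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	part, err := w.CreateFormFile(fieldName, filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(data); err != nil {
		return nil, err
	}
	// Close writes the trailing boundary; the body is invalid without it.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/transcribe", &body)
	if err != nil {
		return nil, err
	}
	// The content type carries the multipart boundary.
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newTranscribeRequest("http://localhost:8080", "audio", "talk.mp3", []byte("placeholder bytes"))
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
	// To send it: resp, err := http.DefaultClient.Do(req), then read the JSON body.
}
```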

`POST /api/outline`

Description: Generate a detailed speaker outline from transcript text.
Request: JSON { "text": "..." }
Response: JSON { "response": "..." }

`POST /api/bulletpoints`

Description: Convert transcript text into bullet points.
Request: JSON { "text": "..." }
Response: JSON { "response": "..." }
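
Because /api/outline and /api/bulletpoints share the same request and response shape, one small client function can cover both. The sketch below assumes a successful call returns HTTP 200; analyze and newStubServer are hypothetical names, and the stub server only stands in for the real backend so the example runs without an OpenAI key.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// analysisRequest and analysisResponse mirror the documented JSON shapes.
type analysisRequest struct {
	Text string `json:"text"`
}

type analysisResponse struct {
	Response string `json:"response"`
}

// analyze posts transcript text to endpoint ("/api/outline" or "/api/bulletpoints").
func analyze(baseURL, endpoint, text string) (string, error) {
	payload, err := json.Marshal(analysisRequest{Text: text})
	if err != nil {
		return "", err
	}
	resp, err := http.Post(baseURL+endpoint, "application/json", bytes.NewReader(payload))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK { // assumption: success is reported as 200
		return "", fmt.Errorf("%s: unexpected status %s", endpoint, resp.Status)
	}
	var out analysisResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Response, nil
}

// newStubServer is a stand-in for the real backend so the sketch runs offline.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var in analysisRequest
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		json.NewEncoder(w).Encode(analysisResponse{Response: "stub for " + r.URL.Path + ": " + in.Text})
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()
	for _, ep := range []string{"/api/outline", "/api/bulletpoints"} {
		out, err := analyze(srv.URL, ep, "Welcome everyone to my talk.")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Against a running instance, pass http://localhost:8080 as baseURL instead of the stub server's URL.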

Configuration

OPENAI_API_KEY (required): Your OpenAI API key for transcription and text analysis.
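
Since the key is required, a server would typically check for it at startup and fail early. The following is a minimal sketch of that pattern, not the project's actual startup code; loadAPIKey is a hypothetical helper that takes the lookup function as a parameter so it can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// loadAPIKey reads OPENAI_API_KEY via the supplied lookup function
// (os.LookupEnv in production) and rejects a missing or empty value.
func loadAPIKey(lookup func(string) (string, bool)) (string, error) {
	key, ok := lookup("OPENAI_API_KEY")
	if !ok || key == "" {
		return "", errors.New("OPENAI_API_KEY is required but not set")
	}
	return key, nil
}

func main() {
	key, err := loadAPIKey(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// Print only the length so the secret never reaches the logs.
	fmt.Printf("API key loaded (%d characters)\n", len(key))
}
```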

Contributing

Contributions are welcome! Please open issues or pull requests for bug fixes, features, or improvements.

Follow standard Go and React/TypeScript best practices.
Ensure code is well-documented and tested.
By contributing, you agree to license your work under the MIT License.

Roadmap

The following roadmap outlines possible directions for community-driven development:

1. Customizability

Custom AI Actions: Define your own AI-powered actions (e.g., "Summarize for LinkedIn", "Extract Q&A", "Generate Quiz") via prompt templates.
Personalized Feedback: Set personal goals (e.g., pacing, filler word reduction) and receive targeted feedback.
UI Themes & Layouts: Dark mode, font size, and customizable dashboard widgets.
Language & Region Settings: Multi-language UI and locale-aware formatting.
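
As a rough idea of what prompt-template-based custom actions could look like, the sketch below renders a user-defined template with Go's text/template package. Nothing here exists in the project yet: the action type, its fields, and the {{.Transcript}} placeholder are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// action is a hypothetical user-defined AI action: a name plus a prompt template.
type action struct {
	Name   string
	Prompt string // text/template syntax; {{.Transcript}} is the only field
}

// render fills the action's template with the transcript text.
func (a action) render(transcript string) (string, error) {
	tmpl, err := template.New(a.Name).Parse(a.Prompt)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, struct{ Transcript string }{transcript}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	a := action{Name: "linkedin", Prompt: "Summarize for LinkedIn:\n{{.Transcript}}"}
	prompt, err := a.render("Today I want to talk about testing.")
	if err != nil {
		fmt.Println("render:", err)
		return
	}
	fmt.Println(prompt)
}
```

The rendered string would then be sent to the GPT model in place of the built-in outline or bullet point prompt.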

2. Advanced AI Enhancements

Speech Analytics: Detect filler words, pacing, pauses, and provide visual analytics. Sentiment/emotion analysis of speech.
Speaker Coaching: AI-generated tips based on transcript analysis (clarity, engagement). Simulated audience Q&A and answer quality rating.
Custom Model Integration: Plug in your own OpenAI-compatible models or endpoints. Support for fine-tuned models for specific domains.
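
Filler word detection, the simplest of the speech analytics ideas above, can be approximated with plain word counting on the transcript. This is only an illustrative sketch: countFillers is a hypothetical function, the filler list is an example, and multi-word fillers such as "you know" are not handled.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// countFillers counts single-word fillers, ignoring case and punctuation.
func countFillers(transcript string, fillers []string) map[string]int {
	set := make(map[string]bool, len(fillers))
	for _, f := range fillers {
		set[strings.ToLower(f)] = true
	}

	// Split on anything that is not a letter or an apostrophe.
	words := strings.FieldsFunc(strings.ToLower(transcript), func(r rune) bool {
		return !unicode.IsLetter(r) && r != '\''
	})

	counts := make(map[string]int)
	for _, w := range words {
		if set[w] {
			counts[w]++
		}
	}
	return counts
}

func main() {
	counts := countFillers(
		"Um, so I think, uh, this is, um, basically done.",
		[]string{"um", "uh", "basically"},
	)
	fmt.Println(counts)
}
```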

3. Training & Practice Tools

Practice Mode: Timed speaking exercises with real-time feedback.
Progress Tracking: Visualize improvement over time (charts, badges, milestones).
Peer Review: Share recordings/transcripts for community or mentor feedback.
Scenario Library: Pre-built scenarios (e.g., job interview, TED talk) with tailored feedback.

4. Collaboration & Integrations

Team Workspaces: Shared libraries for teams, clubs, or classes.
Export Options: Export to PDF, DOCX, slides, or share via link.
API & Plugin System: Third-party integrations (calendar, LMS, video platforms).

5. Privacy & Data Control

On-Premise/Private Mode: Option to run all processing locally or on self-hosted infrastructure.
Granular Data Controls: User control over data retention, sharing, and deletion.

License

This project is licensed under the MIT License.

Acknowledgements
