Tts Example App

gauravjha8298

A Next.js-based web application that converts text to speech using third-party APIs. Built with TypeScript, Tailwind CSS, and Docker, this app provides a simple yet powerful interface for users to generate speech from text.

Download

Text-to-Speech Converter

Features

Responsive UI: Built with Tailwind CSS for a modern and mobile-friendly design.
Hybrid Rendering: Combines server-side rendering (SSR) and client-side interactivity.
Third-Party API Integration: Fetches available voices and generates speech using external APIs.
Containerized Deployment: Dockerized for consistent environment setups.
Comprehensive Testing: Includes unit tests (React Testing Library) and end-to-end tests (Playwright).

Architecture Overview

The app follows a Client-Server Architecture with distinct layers and responsibilities.

1. Frontend Layer (Client-Side)

Role: Manages user interactions, renders the UI, and communicates with the backend API.
Key Components:
- TextInput: Text box for user input.
- VoiceSelector: Dropdown to select available voices.
- PlayButton: Button to trigger text-to-speech conversion.
State Management: Local state is managed with React's useState.
API Communication: Axios is used to interact with backend APIs.
Styling: Tailwind CSS ensures a clean and responsive design.

2. Backend Layer (Server-Side)

Role: Processes API requests and interacts with third-party services.
API Routes:
- /api/getVoices: Fetches available voices from a third-party text-to-speech API.
- /api/synthesize: Converts text to speech and returns an audio URL.
Server Rendering: Uses Next.js's App Router for SSR and optimized performance.

3. Third-Party Services (Evenlabs)

Provides external functionality for:
- Fetching a list of available voices.
- Generating speech from user input.

4. DevOps and Deployment Layer

Docker: Containerized with a Dockerfile and managed via docker-compose.
Build System: Uses Turbopack for fast development.
Hosting: Ready for deployment on platforms like Vercel, AWS, or Azure.

Data Flow

User Interaction:
- The user enters text, selects a voice, and clicks the "Play" button.
API Request:
- The frontend sends a request to the backend API (/api/synthesize) with the input text and selected voice.
Third-Party API:
- The backend forwards the request to a third-party API for processing.
Response:
- The backend returns the generated binary audio stream to the frontend.
Audio Playback:
- The frontend plays the audio by creating an audio context and connecting it to a source i.e. speakers.

Getting Started

Prerequisites

Node.js: v18+
Docker: v20+
npm: v8+

Installation

Clone the repository:
Install dependencies:
```
npm install
```
Start the development server:
```
npm run dev
```
Access the app at http://localhost:3000.

Docker Setup

Build the Docker image:
```
docker-compose build
```
Run the container:
```
docker-compose up
```

File Structure

src/
├── app/                # App Router for Next.js pages
├── components/         # Reusable React components
├── lib/                # Custom Libraries & classes (e.g. TTSClient class)
├── pages/              # API routes
├── styles/             # styles
├
tests/
├── __tests__/          # Unit tests
├── e2e/                # End-to-end tests

Testing

Unit Tests

Run the unit tests using Jest and React Testing Library:

npm run test

End-to-End Tests

Run Playwright tests for end-to-end functionality:

npx playwright test

Deployment

Production Build

Build the app for production:
```
npm run build
```
Start the production server:
```
npm start
```

Docker Deployment

Build the Docker image:
```
docker build -t tts-converter .
```
Run the container:
```
docker run -p 3000:3000 tts-converter
```

Top categories