
Replace Hardcoded Values with Enums - Improve Code Maintainability #265

@san0808

Description


Overview

While checking out the openai_llm code, I found that the models supported for json_format are only the older ones. That led me to test a number of things in the file, and I realized the Bolna codebase currently contains numerous hardcoded string literals and magic values that make the code harder to maintain, more error-prone, and difficult to extend. This proposal outlines a comprehensive refactoring to replace these with proper enums and constants.

Why This Matters

  • Type Safety: Enums enable static checking (via type checkers like mypy) and IDE autocompletion
  • Maintainability: Centralized constants make changes easier
  • Documentation: Enums serve as living documentation of valid values
  • Error Prevention: Reduces typos and invalid value usage
  • Extensibility: Easy to add new providers/formats/types

Identified Hardcoded Values

🔴 HIGH PRIORITY - Core Infrastructure

1. Provider Names

Current Issues:

File: bolna/providers.py (Lines 7-50)

SUPPORTED_SYNTHESIZER_MODELS = {
    'polly': PollySynthesizer,
    'elevenlabs': ElevenlabsSynthesizer,
    'openai': OPENAISynthesizer,
    'deepgram': DeepgramSynthesizer,
    'azuretts': AzureSynthesizer,
    'cartesia': CartesiaSynthesizer,
    'smallest': SmallestSynthesizer,
    'sarvam': SarvamSynthesizer,
    'rime': RimeSynthesizer
}

SUPPORTED_LLM_PROVIDERS = {
    'openai': OpenAiLLM,
    'cohere': LiteLLM,
    'ollama': LiteLLM,
    'deepinfra': LiteLLM,
    'together': LiteLLM,
    'fireworks': LiteLLM,
    'azure-openai': LiteLLM,
    'perplexity': LiteLLM,
    'vllm': LiteLLM,
    'anyscale': LiteLLM,
    'custom': OpenAiLLM,
    'ola': OpenAiLLM,
    'groq': LiteLLM,
    'anthropic': LiteLLM,
    'deepseek': LiteLLM,
    'openrouter': LiteLLM,
    'azure': LiteLLM
}

File: bolna/models.py (Line 102)

@field_validator("provider")
def validate_model(cls, value):
    return validate_attribute(value, ["polly", "elevenlabs", "openai", "deepgram", "azuretts", "cartesia", "smallest", "sarvam", "rime"])

File: bolna/models.py (Line 89)

@field_validator("provider")
def validate_model(cls, value):
    return validate_attribute(value, list(SUPPORTED_TRANSCRIBER_PROVIDERS.keys()))

Proposed Solution:

from enum import Enum

class SynthesizerProvider(str, Enum):
    POLLY = "polly"
    ELEVENLABS = "elevenlabs"
    OPENAI = "openai"
    DEEPGRAM = "deepgram"
    AZURE_TTS = "azuretts"
    CARTESIA = "cartesia"
    SMALLEST = "smallest"
    SARVAM = "sarvam"
    RIME = "rime"

class TranscriberProvider(str, Enum):
    DEEPGRAM = "deepgram"
    WHISPER = "whisper"
    AZURE = "azure"
    ASSEMBLY_AI = "assemblyai"

class LLMProvider(str, Enum):
    OPENAI = "openai"
    COHERE = "cohere"
    OLLAMA = "ollama"
    ANTHROPIC = "anthropic"
    GROQ = "groq"
    # ... etc
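
With these enums in place, the registries in providers.py can be keyed by enum members while plain-string lookups keep working, since str enum members hash and compare like their values (a minimal sketch; class names match the current imports):

SUPPORTED_SYNTHESIZER_MODELS = {
    SynthesizerProvider.POLLY: PollySynthesizer,
    SynthesizerProvider.ELEVENLABS: ElevenlabsSynthesizer,
    SynthesizerProvider.OPENAI: OPENAISynthesizer,
    # ... remaining providers
}

# Both lookups resolve to the same entry:
assert SUPPORTED_SYNTHESIZER_MODELS[SynthesizerProvider.POLLY] is PollySynthesizer
assert SUPPORTED_SYNTHESIZER_MODELS["polly"] is PollySynthesizer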

Additional Files Affected:

  • bolna/synthesizer/polly_synthesizer.py
  • bolna/synthesizer/elevenlabs_synthesizer.py
  • bolna/synthesizer/openai_synthesizer.py
  • bolna/synthesizer/deepgram_synthesizer.py
  • bolna/synthesizer/azure_synthesizer.py
  • bolna/synthesizer/cartesia_synthesizer.py
  • bolna/synthesizer/smallest_synthesizer.py
  • bolna/synthesizer/sarvam_synthesizer.py
  • bolna/synthesizer/rime_synthesizer.py
  • bolna/transcriber/deepgram_transcriber.py
  • bolna/transcriber/whisper_transcriber.py
  • bolna/transcriber/azure_transcriber.py
  • bolna/transcriber/assemblyai_transcriber.py

2. Audio/Media Formats

Current Issues:

File: bolna/models.py (Line 79)

class Transcriber(BaseModel):
    provider: str
    encoding: Optional[str] = "linear16"
    language: Optional[str] = "en"
    model: Optional[str] = None
    stream: bool = True

File: bolna/models.py (Line 97)

class Synthesizer(BaseModel):
    provider: str
    provider_config: Union[PollyConfig, ElevenLabsConfig, AzureConfig, RimeConfig, SmallestConfig, SarvamConfig, CartesiaConfig, DeepgramConfig, OpenAIConfig] = Field(union_mode='smart')
    stream: bool = False
    buffer_size: Optional[int] = 40  # 40 characters in a buffer
    audio_format: Optional[str] = "pcm"
    caching: Optional[bool] = True

File: bolna/models.py (Line 108)

class IOModel(BaseModel):
    provider: str
    format: Optional[str] = "wav"

File: bolna/assistant.py (Lines 17, 22)

tools_config_args['input'] = {
    "format": "wav",
    "provider": "default"
}

tools_config_args['output'] = {
    "format": "wav", 
    "provider": "default"
}

File: bolna/synthesizer/openai_synthesizer.py (Line 17)

def get_format(self, format):
    return "mp3"

Proposed Solution:

class AudioFormat(str, Enum):
    WAV = "wav"
    PCM = "pcm"
    MP3 = "mp3"
    FLAC = "flac"

class AudioEncoding(str, Enum):
    LINEAR16 = "linear16"
    MULAW = "mulaw"
    ALAW = "alaw"

class ResponseFormat(str, Enum):
    TEXT = "text"
    JSON_OBJECT = "json_object"
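
Typing the Pydantic fields with these enums makes the models self-validating, replacing the hand-rolled validate_attribute calls (a sketch against the current Transcriber model; invalid values raise ValidationError automatically):

from typing import Optional
from pydantic import BaseModel

class Transcriber(BaseModel):
    provider: TranscriberProvider
    encoding: Optional[AudioEncoding] = AudioEncoding.LINEAR16
    language: Optional[str] = "en"
    model: Optional[str] = None
    stream: bool = True

Transcriber(provider="deepgram")        # ok, coerced to TranscriberProvider.DEEPGRAM
Transcriber(provider="not-a-provider")  # raises pydantic.ValidationError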

Additional Files Affected:

  • bolna/synthesizer/base_synthesizer.py
  • bolna/transcriber/base_transcriber.py
  • bolna/output_handlers/telephony_providers/twilio.py
  • bolna/output_handlers/telephony.py
  • bolna/helpers/utils.py
  • bolna/agent_manager/task_manager.py

3. Task Types

Current Issues:

File: bolna/models.py (Line 334)

class Task(BaseModel):
    tools_config: ToolsConfig
    toolchain: ToolsChainModel
    task_type: Optional[str] = "conversation"  # extraction, summarization, notification
    task_config: ConversationConfig = dict()

File: bolna/agent_manager/task_manager.py (Lines 726-737)

def __setup_tasks(self, llm=None, agent_type=None, assistant_config=None):
    if self.task_config["task_type"] == "conversation" and not self.__is_multiagent():
        self.tools["llm_agent"] = self.__get_agent_object(llm, agent_type, assistant_config)
    elif self.__is_multiagent():
        return self.__get_agent_object(llm, agent_type, assistant_config)
    elif self.task_config["task_type"] == "extraction":
        logger.info("Setting up extraction agent")
        self.tools["llm_agent"] = ExtractionContextualAgent(llm, prompt=self.system_prompt)
        self.extracted_data = None
    elif self.task_config["task_type"] == "summarization":
        logger.info("Setting up summarization agent")
        self.tools["llm_agent"] = SummarizationContextualAgent(llm, prompt=self.system_prompt)
        self.summarized_data = None

File: bolna/agent_manager/task_manager.py (Line 755)

async def load_prompt(self, assistant_name, task_id, local, **kwargs):
    if self.task_config["task_type"] == "webhook":
        return

File: local_setup/quickstart_server.py (Line 66)

if task['task_type'] == "extraction":
    extraction_prompt_llm = os.getenv("EXTRACTION_PROMPT_GENERATION_MODEL")
    extraction_prompt_generation_llm = LiteLLM(model=extraction_prompt_llm, max_tokens=2000)

Proposed Solution:

class TaskType(str, Enum):
    CONVERSATION = "conversation"
    EXTRACTION = "extraction"
    SUMMARIZATION = "summarization"
    NOTIFICATION = "notification"
    WEBHOOK = "webhook"
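
The task_manager.py branches can then compare against enum members; because TaskType subclasses str, configs that still carry plain strings keep matching (a sketch of the same dispatch):

task_type = self.task_config["task_type"]

if task_type == TaskType.CONVERSATION and not self.__is_multiagent():
    self.tools["llm_agent"] = self.__get_agent_object(llm, agent_type, assistant_config)
elif task_type == TaskType.EXTRACTION:
    logger.info("Setting up extraction agent")
    self.tools["llm_agent"] = ExtractionContextualAgent(llm, prompt=self.system_prompt)
    self.extracted_data = None
elif task_type == TaskType.SUMMARIZATION:
    logger.info("Setting up summarization agent")
    self.tools["llm_agent"] = SummarizationContextualAgent(llm, prompt=self.system_prompt)
    self.summarized_data = None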

Files Affected:

  • bolna/models.py (Line 334)
  • bolna/agent_manager/task_manager.py (Lines 726, 730, 734, 755)
  • local_setup/quickstart_server.py (Line 66)

🟡 MEDIUM PRIORITY - Feature Enhancement

4. Pipeline Components

Current Issues:

File: bolna/assistant.py (Lines 26-36)

if transcriber is None:
    pipelines.append(["llm"])
    tools_config_args['transcriber'] = transcriber

pipeline = ["transcriber", "llm"]
if synthesizer is not None:
    pipeline.append("synthesizer") 
    tools_config_args["synthesizer"] = synthesizer
pipelines.append(pipeline)

if enable_textual_input:
    pipelines.append(["llm"])

File: bolna/models.py (Line 302)

class ToolsChainModel(BaseModel):
    execution: str = "parallel"
    pipelines: List[List[str]]

Proposed Solution:

class PipelineComponent(str, Enum):
    TRANSCRIBER = "transcriber"
    LLM = "llm"
    SYNTHESIZER = "synthesizer"
    INPUT = "input"
    OUTPUT = "output"
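
assistant.py can then assemble pipelines from enum members instead of raw strings (a sketch of the current logic):

pipeline = [PipelineComponent.TRANSCRIBER, PipelineComponent.LLM]
if synthesizer is not None:
    pipeline.append(PipelineComponent.SYNTHESIZER)
    tools_config_args["synthesizer"] = synthesizer
pipelines.append(pipeline)

if enable_textual_input:
    pipelines.append([PipelineComponent.LLM])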

5. Agent Types

Current Issues:

File: bolna/models.py (Line 340)

class AgentModel(BaseModel):
    agent_name: str
    agent_type: str = "other"

File: API.md (Lines 45-52)

"llm_agent": {
    "agent_type": "simple_llm_agent",
    "agent_flow_type": "streaming",
    "routes": null,
    "llm_config": {
        "agent_flow_type": "streaming",
        "provider": "openai",
        "request_json": true,
        "model": "gpt-4o-mini"
    }
}

Proposed Solution:

class AgentType(str, Enum):
    SIMPLE_LLM = "simple_llm_agent"
    CONVERSATIONAL = "conversational_agent"
    EXTRACTION = "extraction_agent"
    GRAPH_BASED = "graph_based_agent"
    OTHER = "other"

6. OpenAI Models (Fix for Compatibility Issue)

Current Issues:

File: bolna/llms/openai_llm.py (Line 199)

def get_response_format(self, is_json_format: bool):
    if is_json_format and self.model in ('gpt-4-1106-preview', 'gpt-3.5-turbo-1106', 'gpt-4o-mini'):
        return {"type": "json_object"}
    else:
        return {"type": "text"}

File: bolna/llms/openai_llm.py (Line 17)

def __init__(self, max_tokens=100, buffer_size=40, model="gpt-3.5-turbo-16k", temperature=0.1, language=DEFAULT_LANGUAGE_CODE, **kwargs):

File: bolna/models.py (Line 135)

class MongoDBProviderConfig(BaseModel):
    connection_string: Optional[str] = None
    db_name: Optional[str] = None
    collection_name: Optional[str] = None
    index_name: Optional[str] = None
    llm_model: Optional[str] = "gpt-3.5-turbo"
    embedding_model: Optional[str] = "text-embedding-3-small"
    embedding_dimensions: Optional[int] = 256

Proposed Solution:

class OpenAIModel(str, Enum):
    GPT_35_TURBO = "gpt-3.5-turbo"
    GPT_35_TURBO_1106 = "gpt-3.5-turbo-1106"
    GPT_4_1106_PREVIEW = "gpt-4-1106-preview"
    GPT_4O_MINI = "gpt-4o-mini"
    GPT_4O = "gpt-4o"
    GPT_4_TURBO = "gpt-4-turbo"
    GPT_41_NANO = "gpt-4.1-nano"  # New model support

class OpenAICapability(str, Enum):
    JSON_MODE = "json_mode"
    FUNCTION_CALLING = "function_calling"
    STREAMING = "streaming"

# Model capabilities mapping
OPENAI_MODEL_CAPABILITIES = {
    OpenAIModel.GPT_35_TURBO_1106: [OpenAICapability.JSON_MODE, OpenAICapability.STREAMING],
    OpenAIModel.GPT_4_1106_PREVIEW: [OpenAICapability.JSON_MODE, OpenAICapability.FUNCTION_CALLING],
    OpenAIModel.GPT_4O_MINI: [OpenAICapability.JSON_MODE, OpenAICapability.STREAMING],
    OpenAIModel.GPT_4O: [OpenAICapability.JSON_MODE, OpenAICapability.FUNCTION_CALLING],
    OpenAIModel.GPT_41_NANO: [OpenAICapability.JSON_MODE, OpenAICapability.STREAMING],
}
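
get_response_format can then consult the capability mapping instead of a hardcoded tuple, so supporting a new model becomes a one-line addition to the mapping (a sketch; self.model stays a plain string, which still matches the str enum keys):

def get_response_format(self, is_json_format: bool):
    # Plain model-name strings hash like the enum keys, so .get() works either way
    capabilities = OPENAI_MODEL_CAPABILITIES.get(self.model, [])
    if is_json_format and OpenAICapability.JSON_MODE in capabilities:
        return {"type": "json_object"}
    return {"type": "text"}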

🟢 LOWER PRIORITY - Nice to Have

7. Status/State Values

Current Issues:

File: local_setup/quickstart_server.py (Line 59)

data_for_db["assistant_status"] = "seeding"

File: local_setup/quickstart_server.py (Line 82)

return {"agent_id": agent_uuid, "state": "created"}

File: API.md (Line 101)

{
    "agent_id": "uuid-string",
    "state": "created"
}

Proposed Solution:

class AgentStatus(str, Enum):
    CREATED = "created"
    SEEDING = "seeding"
    ACTIVE = "active"
    COMPLETED = "completed"
    FAILED = "failed"
    DELETED = "deleted"
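
Since AgentStatus subclasses str, it drops straight into the existing response dicts and serializes as the plain value (sketch):

data_for_db["assistant_status"] = AgentStatus.SEEDING

return {"agent_id": agent_uuid, "state": AgentStatus.CREATED}
# json.dumps(...) still produces {"state": "created"}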

8. Message/Data Types

Current Issues:

File: bolna/output_handlers/default.py (Lines 59-82)

if packet["meta_info"]['type'] in ('audio', 'text'):
    if packet["meta_info"]['type'] == 'audio':
        logger.info(f"Sending audio")
        data = base64.b64encode(packet['data']).decode("utf-8")
    elif packet["meta_info"]['type'] == 'text':
        logger.info(f"Sending text response {packet['data']}")
        data = packet['data']

# sending of pre-mark message
if packet["meta_info"]['type'] == 'audio':
    pre_mark_event_meta_data = {
        "type": "pre_mark_message",
    }
    mark_id = str(uuid.uuid4())
    self.mark_event_meta_data.update_data(mark_id, pre_mark_event_meta_data)
    mark_message = {
        "type": "mark",
        "name": mark_id
    }

response = {"data": data, "type": packet["meta_info"]['type']}

Proposed Solution:

class MessageType(str, Enum):
    AUDIO = "audio"
    TEXT = "text"
    MARK = "mark"
    PRE_MARK = "pre_mark_message"
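
The default.py handler then reads as follows (a sketch of the same logic; plain-string packets keep matching because MessageType subclasses str):

msg_type = packet["meta_info"]["type"]

if msg_type in (MessageType.AUDIO, MessageType.TEXT):
    if msg_type == MessageType.AUDIO:
        logger.info("Sending audio")
        data = base64.b64encode(packet["data"]).decode("utf-8")
    else:
        logger.info(f"Sending text response {packet['data']}")
        data = packet["data"]

response = {"data": data, "type": msg_type}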

9. Language Codes

Current Issues:

File: bolna/constants.py (Lines 35-66)

PRE_FUNCTION_CALL_MESSAGE = {
    "en": "Just give me a moment, I'll be back with you.",
    "ge": "Geben Sie mir einen Moment Zeit, ich bin gleich wieder bei Ihnen."
}

TRANSFERING_CALL_FILLER = {
    "en": "Sure, I'll transfer the call for you. Please wait a moment...",
    "fr": "D'accord, je transfère l'appel. Un instant, s'il vous plaît."
}

DEFAULT_LANGUAGE_CODE = 'en'

File: bolna/models.py (Line 79)

class Transcriber(BaseModel):
    provider: str
    encoding: Optional[str] = "linear16"
    language: Optional[str] = "en"
    model: Optional[str] = None
    stream: bool = True

File: API.md (Line 72)

"transcriber": {
    "encoding": "linear16",
    "language": "en",
    "provider": "deepgram",
    "stream": true
}

Proposed Solution:

class LanguageCode(str, Enum):
    ENGLISH = "en"
    GERMAN = "ge"
    FRENCH = "fr"
    SPANISH = "es"
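
The filler dictionaries in constants.py can then be keyed by the enum, and existing plain-string lookups keep working (sketch; note the codebase currently uses "ge" for German, kept as-is here even though the ISO 639-1 code is "de"):

PRE_FUNCTION_CALL_MESSAGE = {
    LanguageCode.ENGLISH: "Just give me a moment, I'll be back with you.",
    LanguageCode.GERMAN: "Geben Sie mir einen Moment Zeit, ich bin gleich wieder bei Ihnen."
}

DEFAULT_LANGUAGE_CODE = LanguageCode.ENGLISH

assert PRE_FUNCTION_CALL_MESSAGE["en"] == PRE_FUNCTION_CALL_MESSAGE[LanguageCode.ENGLISH]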

Implementation Strategy

Phase 1: Core Infrastructure (High Priority)

  1. Create bolna/enums/ package with separate files for each enum category
  2. Replace provider mappings in providers.py
  3. Update model validations in models.py
  4. Update audio format handling across synthesizers/transcribers

Phase 2: Feature Enhancement (Medium Priority)

  1. Replace task type strings throughout the codebase
  2. Update pipeline component references
  3. Fix OpenAI model compatibility issues with enum-based approach

Phase 3: Polish (Lower Priority)

  1. Replace status/state strings
  2. Centralize message types
  3. Standardize language codes

Proposed File Structure

bolna/
├── enums/
│   ├── __init__.py
│   ├── providers.py      # SynthesizerProvider, TranscriberProvider, LLMProvider
│   ├── formats.py        # AudioFormat, AudioEncoding, ResponseFormat
│   ├── tasks.py          # TaskType, PipelineComponent, AgentType
│   ├── models.py         # OpenAIModel, OpenAICapability + capabilities mapping
│   ├── states.py         # AgentStatus, MessageType
│   └── localization.py   # LanguageCode
└── ...
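
The package __init__.py can re-export everything so call sites need only a single import, e.g. from bolna.enums import TaskType (sketch):

# bolna/enums/__init__.py
from .providers import SynthesizerProvider, TranscriberProvider, LLMProvider
from .formats import AudioFormat, AudioEncoding, ResponseFormat
from .tasks import TaskType, PipelineComponent, AgentType
from .models import OpenAIModel, OpenAICapability, OPENAI_MODEL_CAPABILITIES
from .states import AgentStatus, MessageType
from .localization import LanguageCode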

Breaking Changes

  • Minimal: Most changes will be backward compatible thanks to str enum inheritance (see the demonstration below)
  • Migration: Existing string values will continue to work during transition
  • Documentation: Update API documentation to reference enum values
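
Concretely, str inheritance means existing string comparisons, dict lookups, and JSON serialization keep working during the migration (a quick demonstration):

import json

assert TaskType.CONVERSATION == "conversation"
assert "webhook" == TaskType.WEBHOOK
assert json.dumps({"task_type": TaskType.CONVERSATION}) == '{"task_type": "conversation"}'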

Benefits After Implementation

  1. Better IDE Support: Autocompletion for all provider/format/type values
  2. Static Safety: Catch invalid values via type checkers and Pydantic validation before they hit runtime
  3. Easier Extension: Adding new providers/formats becomes trivial
  4. Centralized Documentation: All valid values documented in one place
  5. Future-proof: Easy to add new OpenAI models or other providers

Estimated Impact

  • Files to modify: ~25-30 files
  • Lines of code: ~200-300 changes
  • Risk level: Low (backward compatible with proper migration)
  • Developer experience: Significantly improved

Open to discussing this and setting proper priorities before kicking off the tasks needed here.

Note: I used Claude to format this message so it conveys the proposal better.
