Dev Diaries 1: Tackling Complexity in Braintune.ai's API Design

Welcome to the first entry in our Dev Diaries series, where I'll delve into the inner workings of Braintune.ai and share our development journey. In this post, I'll discuss the challenges we face and the strategies we use to build a robust AI service.

Braintune.ai's mission is to refine the interaction between users and their prompts. Our workflow lets developers and prompt engineers edit prompts in an intuitive UI, deploy those prompts in their apps, and then feed performance data back to our service. The result is a cyclical process of continuous feedback, analysis, and optimization, reminiscent of the MLOps model, where the focus is on constant refinement and integration of models.

The Challenge: Managing Complexity in API Design

To cater to a diverse user base, we plan to support a range of model providers and are even considering self-hosting options in the future. Variety is key, so we also offer several types of prompts to our users.

However, flexibility in development comes with its own set of challenges. Our current MVP already has several entities to consider:
  • Three types of prompts (static, dynamic, chat), each producing different events
  • Just one model provider (OpenAI), but even it has two modes of operation: streaming and single-response
  • Several pricing models, not only across different providers but also within the OpenAI integration itself, which prices GPT-4 separately; Anthropic, by contrast, charges per character

These complexities wouldn't be a significant issue if OpenAI itself used a single design for both modes. However, it currently returns usage stats only with single responses, not in streaming mode. In one case we can simply aggregate what the user sends us; in the other, we have to count tokens ourselves.
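For the streaming case, counting tokens ourselves might look like the minimal sketch below, which uses OpenAI's tiktoken library; the model name and the chunk handling are illustrative assumptions, not our actual implementation:
import tiktoken

def count_streamed_tokens(model: str, prompt: str, chunks: list[str]) -> int:
    # tiktoken resolves the tokenizer encoding from the model name
    enc = tiktoken.encoding_for_model(model)
    prompt_tokens = len(enc.encode(prompt))
    # a streamed completion arrives as chunks; join them before encoding
    completion_tokens = len(enc.encode("".join(chunks)))
    return prompt_tokens + completion_tokens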
The biggest problem is that this part of the system is public-facing, and I believe that good API design is key to the success of our venture.

Our Initial Approach: A Single Endpoint

To tackle this problem, we initially wanted to provide a single endpoint, with all the logic for switching between event types delegated to a single factory. In this approach, different value object classes would be selected based on a discriminator field.
from fastapi import Depends, status

# router, tags, and the input/response models are defined elsewhere in the app
@router.post(
    "/",
    tags=tags,
    operation_id="send_event",
    response_model=EventIdResponse,
    status_code=status.HTTP_202_ACCEPTED,
)
async def send_event(
    event_input: StandardEventInput | OpenAiEventInput,
    user: UserEntity = Depends(use_api_key_to_get_user),
) -> EventIdResponse:
    ...
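The factory itself could lean on Pydantic's discriminated unions. Here is a minimal sketch, assuming a hypothetical "type" field as the discriminator (the field names are illustrative, not our actual schema):
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field

class StandardEventInput(BaseModel):
    type: Literal["standard"]  # discriminator value
    prompt_id: str

class OpenAiEventInput(BaseModel):
    type: Literal["openai"]  # discriminator value
    prompt_id: str
    prompt_tokens: int
    completion_tokens: int

# Pydantic picks the concrete class by inspecting the "type" field
EventInput = Annotated[
    Union[StandardEventInput, OpenAiEventInput],
    Field(discriminator="type"),
]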
While this seemed like an easy-to-understand implementation for API users, we decided it was better not to hide all the complexity. Exposing a more nuanced model to the user reduces error-prone logic and allows for more design possibilities. Striking a balance between exposing all internals at the API level and providing a minimal interface can be challenging.

The Improved Approach: Exposing Specific Models

So, we opted to expose more specific models to the user like this:
@router.post(
    "/events/openai/simple",
    tags=tags,  # type: ignore
    operation_id="send_open_ai_event",
    response_model=EventIdResponse,
    status_code=status.HTTP_202_ACCEPTED,
    summary="Transmit events generated using the OpenAI API.",
)
async def send_open_ai_event(
    event_input: OpenAiEventInput,
    user: UserEntity = Depends(use_api_key_to_get_user),
) -> EventIdResponse:
    ...


@router.post(
    "/events/openai/streaming",
    tags=tags,  # type: ignore
    operation_id="send_open_ai_streaming_event",
    response_model=EventIdResponse,
    status_code=status.HTTP_202_ACCEPTED,
    summary="Transmit events generated using the OpenAI API with streaming enabled.",
)
async def send_open_ai_streaming_event(
    event_input: OpenAiStreamingEventInput,
    user: UserEntity = Depends(use_api_key_to_get_user),
) -> EventIdResponse:
    ...
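To make the distinction concrete, here is a hypothetical sketch of the two input models (the field names are illustrative, not our actual schema). The single-response input can carry the usage stats OpenAI already reports, while the streaming input carries raw text so we can count tokens server-side:
from pydantic import BaseModel

class OpenAiEventInput(BaseModel):
    prompt_id: str
    prompt_tokens: int      # usage stats as reported by OpenAI
    completion_tokens: int

class OpenAiStreamingEventInput(BaseModel):
    prompt_id: str
    prompt: str             # raw text, since streaming returns no usage stats
    completion: str         # tokenized server-side
Note that with separate endpoints, the discriminator field from the earlier sketch is no longer needed; the URL itself tells us which kind of event we received.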
This approach broadens the API surface but reduces complexity inside the application. It also lets us communicate our intent to users through the API design itself, which will be beneficial as we grow and add more types of events.

Managing Internal Complexity: Polymorphism

To manage the complexity within the application, we can use polymorphism. For example, all event value objects can share a defined set of methods for calculating usage statistics:
from abc import ABC, abstractmethod

class Event(ABC):
    @abstractmethod
    def calculate_token_count(self) -> int:
        ...

class OpenAiEvent(Event):
    def calculate_token_count(self) -> int:
        # OpenAI reports usage directly, so we just sum the parts
        return self.completion_tokens + self.prompt_tokens

class StaticPromptEvent(Event):
    def calculate_token_count(self) -> int:
        # logic specific to StaticPromptEvent
        ...
Or, similarly, for pricing:
class Pricing:
    def calculate_cost(self, tokens: int) -> float:
        raise NotImplementedError

class OpenAiClassicPricing(Pricing):
    def calculate_cost(self, tokens: int) -> float:
        # logic specific to OpenAiClassicPricing
        ...

class OpenAiGpt4Pricing(Pricing):
    def calculate_cost(self, tokens: int) -> float:
        # logic specific to OpenAiGpt4Pricing
        ...
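Dispatching to the right pricing class can then be a simple lookup. A hypothetical sketch, with illustrative model names rather than our actual mapping:
# maps a model name to its pricing strategy; the names are illustrative
PRICING_BY_MODEL: dict[str, Pricing] = {
    "gpt-3.5-turbo": OpenAiClassicPricing(),
    "gpt-4": OpenAiGpt4Pricing(),
}

def cost_for_event(model: str, tokens: int) -> float:
    return PRICING_BY_MODEL[model].calculate_cost(tokens)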
This blend of exposing some complexity at the API boundary and containing the rest behind polymorphic classes helps us navigate the ever-increasing complexity of the application. But there's always more to learn and do, so stay tuned for more insights into our dev journey in upcoming posts!