{"id":26411,"date":"2024-09-15T15:48:44","date_gmt":"2024-09-15T13:48:44","guid":{"rendered":"https:\/\/blog.mi.hdm-stuttgart.de\/?p=26411"},"modified":"2024-09-15T15:48:47","modified_gmt":"2024-09-15T13:48:47","slug":"costudy-rag-in-aws","status":"publish","type":"post","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2024\/09\/15\/costudy-rag-in-aws\/","title":{"rendered":"CoStudy (RAG in AWS)"},"content":{"rendered":"\n<div class=\"wp-block-jetpack-markdown\"><h3>Content<\/h3>\n<ol>\n<li>\n<p>RAG FastAPI<\/p>\n<ol>\n<li>The task<\/li>\n<li>Challenges<\/li>\n<li>Introduction to RAG<\/li>\n<li>The Journey\n<ol>\n<li>Deciding on the technologies\n<ol>\n<li>Easy Decisions<\/li>\n<li>The not so easy decisions<\/li>\n<\/ol>\n<\/li>\n<li>Development\n<ol>\n<li>Read in the PDF<\/li>\n<li>The text-splitter.<\/li>\n<li>Implementing the vector store<\/li>\n<li>Creating a retriever<\/li>\n<li>Creating retrieval chains<\/li>\n<li>Multi-Tenancy<\/li>\n<\/ol>\n<\/li>\n<li>Improve Architecture\n<ol>\n<li>The Architecture<\/li>\n<li>Extensibility<\/li>\n<li>Interchangeability<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<\/li>\n<li>Learnings<\/li>\n<li>Development Team Review<\/li>\n<\/ol>\n<\/li>\n<li>\n<p>Frontend<\/p>\n<ol>\n<li>How Did We Approach the Frontend?<\/li>\n<li>Challenges We Faced<\/li>\n<li>What Could We Do Better?<\/li>\n<li>Looking Ahead<\/li>\n<\/ol>\n<\/li>\n<li>\n<p>Backend<\/p>\n<ol>\n<li>Task<\/li>\n<li>The journey<\/li>\n<li>Challenges<\/li>\n<li>Lessons Learned<\/li>\n<li>Summary<\/li>\n<\/ol>\n<\/li>\n<li>\n<p>Deployment<\/p>\n<ol>\n<li>How did we proceed?<\/li>\n<li>Deployment process<\/li>\n<li>Advantages and challenges\n<ol>\n<li>Advantages<\/li>\n<li>Challenges<\/li>\n<\/ol>\n<\/li>\n<li>Looking left and right\n<ol>\n<li>Alternative deployment strategy<\/li>\n<li>Future developments<\/li>\n<\/ol>\n<\/li>\n<li>What could we do better?<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h1>RAG FastAPI<\/h1>\n<h3>1. 
The Task<\/h3>\n<p>The task was to create a backend service that would enable us to upload and communicate with documents, a process better known as Retrieval Augmented Generation (RAG).<\/p>\n<p>The goal of this subproject was to learn about the general topic of RAG and to work with frameworks like LangChain and LlamaIndex to harness the power of a Large Language Model (LLM).\nOur personal goals were to learn about RAG in general and about the different services and frameworks that are used in the process.<\/p>\n<h3>2. Challenges<\/h3>\n<ul>\n<li>Converting a single process into an abstracted backend service<\/li>\n<li>Implementing a multi-tenancy system<\/li>\n<li>Learning about the different services and frameworks that are used in the process<\/li>\n<\/ul>\n<h3>3. Introduction to RAG<\/h3>\n<p>RAG combines the process of retrieving relevant documents and, based on them, generating an answer to a given question.\nIt is also known as the Hello World of working with LLMs.\nThe following subchapters explain the process in more detail, using the image below.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2024\/09\/CoStudy2.jpeg\" alt=\"RAG-Process\"><\/p>\n<p><a href=\"https:\/\/www.youtube.com\/watch?v=dXxQ0LR-3Hg&amp;t=713s\">Chat with Multiple PDFs | LangChain App Tutorial in Python<\/a><\/p>\n<h4>3.1 Encoding new documents (Upload)<\/h4>\n<p>We start on the left side of the image by chunking the documents into smaller parts.\nThis means you read in a document like a PDF, extract the raw text and then split it by a defined logic into smaller parts.\nThese generated chunks are then encoded into a vector representation of their values.<\/p>\n<p>A sentence like this:<\/p>\n<p><code class=\"\" data-line=\"\">Optimismus hinsichtlich Apples KI-Ambitionen hinkt das Papier des iKonzerns schon seit Monaten im Branchenvergleich hinterher.<\/code><\/p>\n<p>Is then encoded into a vector like 
this:<\/p>\n<p><code class=\"\" data-line=\"\">[0.02661253,-0.011285348,0.01718915,0.050273158,0.018801343,-0.03608132 ...]<\/code><\/p>\n<p>This is what we then call an embedding.\nThese vectors, or embeddings, are then stored in a vector store.<\/p>\n<p>A vector store is an n-dimensional space where those embeddings, representing our documents, are stored.\nBut why do we even need this, and what does it look like?<\/p>\n<p>Down below you can see a 2D representation of a vector store.\nIn reality, this vector store has 1536 dimensions.<\/p>\n<p>In this example we uploaded two documents, both containing analyses of how Apple's and Nvidia's stock prices will develop in the future, with a special focus on their AI ambitions.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2024\/09\/vector-visualization.jpg\" alt=\"Vector Store Visualization\"><\/p>\n<p>The image illustrates points representing document chunks in a vector space.\nChunks with similar content are positioned closer together.<\/p>\n<p>For instance, Nvidia-related chunks form a tight cluster due to their focused content, while Apple-related chunks are more dispersed because they cover diverse topics like the new iPhone release, stock prices, and AI ambitions.<\/p>\n<p>To find similar chunks, we use cosine similarity, which measures the angle between vectors. 
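As a quick illustration of the idea (a standalone sketch in plain Python, not part of our service code), cosine similarity between two embedding vectors can be computed like this:

```python
from math import sqrt

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity = dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing in the same direction score close to 1.0,
# orthogonal vectors score 0.0, opposite vectors -1.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # ~1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

In a real vector store the same computation runs over 1536-dimensional embedding vectors instead of these toy 2D examples.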
Smaller angles indicate higher similarity, allowing us to identify and retrieve the most similar chunks effectively.<\/p>\n<h2>3.2 RAG QA Process<\/h2>\n<p>When we have uploaded our documents to the vector store, we can start the retrieval process.\nFor that we take a look at our picture again.\nThe RAG process starts at the top right corner of the image.\nThe user asks a question, like: \u201cShould I buy Apple or Nvidia stocks?\u201d<\/p>\n<p>This question is then encoded into a vector representation.\nWe then pass this vector to the vector store, which searches for the most similar vectors to the question by applying the cosine similarity method.\nThe most similar vectors are then returned and can be ranked by their relevance.\nThis ranking step is not strictly necessary, but it provides better results.<\/p>\n<p>Then both the question and the retrieved documents are passed to the LLM.\nAt the end the user gets an answer to their question that is based on the retrieved documents.<\/p>\n<h1>4. The Journey<\/h1>\n<p>So now that you have a general understanding of how RAG works, we will explain how we approached this project and tell you which decisions we had to make.<\/p>\n<h2>4.1 Deciding on the technologies<\/h2>\n<p>After the initial research it was time to choose the technologies we would use to implement the application.<\/p>\n<p>Some of the decisions were easy to make, some were not.\nFirst, the easy ones.<\/p>\n<h3>4.1.1 Easy Decisions<\/h3>\n<ul>\n<li><strong>Python<\/strong>:\nPython is the most used language for working with artificial intelligence and data analytics in general.\nSo it was clear that we would use Python for this part of the project. 
But as a side note, there are also other languages that can be used to implement RAG, like Java or JavaScript.<\/li>\n<li><strong>FastAPI<\/strong>:\nFastAPI is a modern, fast (high-performance) web framework for building APIs with Python.\nWe previously worked with FastAPI and really liked its simplicity and features like automatic Swagger documentation and deep integration with Pydantic.<\/li>\n<\/ul>\n<h3>4.1.2 The not so easy decisions<\/h3>\n<p>After we decided on the general technologies, we had to pick a framework and the services we would use to implement the RAG application.<\/p>\n<p>This was quite a challenge: with new technologies and frameworks emerging every day, it was hard to decide what to use.\nBased on the visibility of content on YouTube and Medium, we decided to take a closer look at LangChain and LlamaIndex.<\/p>\n<h4>LangChain vs. LlamaIndex<\/h4>\n<p><strong>&#8211; LangChain<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th><strong>Pros<\/strong><\/th>\n<th><strong>Cons<\/strong><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>&#8211; Flexible and Customizable (Complex Chains)<\/td>\n<td>&#8211; Due to the customizability, beginners have a steeper learning curve<\/td>\n<\/tr>\n<tr>\n<td>&#8211; Suited for several interacting tasks like Chatbots, Retrieval, Agents<\/td>\n<td>&#8211; Larger Community -&gt; more resources for research -&gt; can be overwhelming<\/td>\n<\/tr>\n<tr>\n<td>&#8211; Larger Community<\/td>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>&#8211; LlamaIndex<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th><strong>Pros<\/strong><\/th>\n<th><strong>Cons<\/strong><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>&#8211; Specialized in retrieval and search tasks<\/td>\n<td>&#8211; Primarily focused on search and retrieval (Now also supports Agents, so our analysis is already outdated)<\/td>\n<\/tr>\n<tr>\n<td>&#8211; Better developer experience<\/td>\n<td>&#8211; Less customizable than 
LangChain<\/td>\n<\/tr>\n<tr>\n<td>&#8211; Less resource intensive than LangChain<\/td>\n<td>&#8211; Smaller Community<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>For a more detailed explanation:\n<a href=\"https:\/\/www.datacamp.com\/blog\/langchain-vs-llamaindex\">Comparison of LangChain and LlamaIndex<\/a><\/p>\n<p>To implement a basic RAG application, LlamaIndex initially seemed the obvious choice due to its clear focus on retrieval tasks.\nHowever, we were leaning more towards LangChain because of its flexibility and customization options.\nWe felt that due to LangChain\u2019s larger community, we would have more resources for research and troubleshooting.\nWe also wanted to learn a framework that would enable us to tackle even more complex projects in the future.<\/p>\n<h4>Deciding on the vector store<\/h4>\n<p>So now that the first step was taken, we had to decide on the vector store we would use.\nUntil now we had never been confronted with this type of database, so we did not know what key aspects were important for the decision.\nA very helpful article on this topic was the following. 
It compares popular vector databases and gives a clear overview of the pros and cons of each.\n<a href=\"https:\/\/medium.com\/the-ai-forum\/which-vector-database-should-you-use-choosing-the-best-one-for-your-needs-5108ec7ba133\">Which Vector Database to choose<\/a><\/p>\n<p>Based on this article, we chose Qdrant as the vector store service because of its high performance and the ability to self-host it as a publicly available Docker image.<\/p>\n<h4>What LLM should we use?<\/h4>\n<p>When we started with the project, we were eager to use AWS and primarily its services.\nBased on this, we wanted to use AWS Bedrock as our LLM provider.\nAWS Bedrock is a service that provides a set of LLM models hosted on AWS infrastructure.\nFrom the many LLMs available on AWS Bedrock, we chose Claude 3 Haiku because it was one of the cheapest models available with good performance.<\/p>\n<p>But things changed shortly after our presentation.\nIn our group we discussed switching to Microsoft Azure as our cloud provider.<\/p>\n<p>Because of the uncertainty about which cloud provider we would use, we decided to look for a service that would not be bound to a specific cloud provider.\nSo we decided to go with OpenAI\u2019s API, using GPT-3.5 Turbo and text-embedding-3-small for the meantime.\nWith its state-of-the-art performance at very low cost, even compared to Claude 3 Haiku, this was a clear choice.<\/p>\n<h2>4.2 Development<\/h2>\n<p>After all the general decisions were made, it was time to start the project.\nWe generated an OpenAI API key and started a new FastAPI project.<\/p>\n<p>Because of the extensive research we did, we had plenty of tutorials to follow.\nBut none of them was a 1:1 fit for our project.<\/p>\n<p>Nearly all of the tutorials we watched showed a single Python script with a few lines of code that would handle the RAG process, paired with a Streamlit frontend.\nBesides that, the code was often outdated and did not work 
anymore.<\/p>\n<p>This was quite a challenge for us, and we had to rely primarily on the documentation of LangChain, which is a challenge of its own.\nLangChain is a very powerful framework, but the documentation is not very beginner-friendly.\nThe documentation on their website often covers only the basic functions and does not provide a clear overview of the different methods and their parameters.<\/p>\n<h3>4.2.1 Read in the PDF<\/h3>\n<p>The API endpoint expects a PDF file to be uploaded via multipart\/form-data.\nThis is a more efficient way to transfer files compared to base64 encoding.<\/p>\n<p>In our project we used the PyMuPDF library to read in the PDF file.\nIt is one of the fastest libraries for reading in PDFs, with very good performance in recognizing text.\nBut it also has some downsides.\nIt can\u2019t read in files directly as bytes.\nSo we need to temporarily save the file on the server before we can read it in.\nThis caused some problems on Windows devices, because the OS keeps the file locked after reading it in.\nWe could not solve this problem entirely. 
But this was also not our goal because the application will run on a Linux environment.<\/p>\n<pre><code class=\"\" data-line=\"\">class PDFReaderService(PDFReader):\n    @staticmethod\n    @logger.log_decorator(level=&quot;debug&quot;, message=&quot;Reading in PDF&quot;)\n    async def read_pdf(file):\n        # Create temporary file\n        temp_file = tempfile.NamedTemporaryFile(delete=False)\n        temp_filename = temp_file.name\n        try:\n            async with AIOFile(temp_filename, &#039;wb&#039;) as afp:\n                await afp.write(file)\n                await afp.fsync()\n\n            # Read in PDF\n            loader = PyMuPDFLoader(file_path=temp_filename)\n            data = loader.load()\n\n            # Delete resources\n            del loader\n\n        finally:\n            # Give the OS a moment to release the file lock\n            # (non-blocking sleep, so the event loop is not stalled)\n            await asyncio.sleep(0.1)\n            os.unlink(temp_filename)\n\n        return data\n<\/code><\/pre>\n<h3>4.2.2 The text-splitter.<\/h3>\n<p>The text-splitting process is crucial for the performance of the retrieval process.<\/p>\n<p>There are multiple ways and implementations to split the text into smaller parts.\nEach one has its pros and cons.<\/p>\n<p>Here we will just cover two approaches:<\/p>\n<ul>\n<li><strong>Split by Characters<\/strong>:\nSplits the text exactly after a defined number of characters, with the option for a sliding window.\nThis is the most basic text-splitter and is used in all of the tutorials. 
It creates chunks of the same size, but does not respect the language context.<\/li>\n<li><strong>Split by Tokens<\/strong>:\nA more advanced text-splitter that considers the semantic meaning of the text.\nThere are several implementations of this, again with their pros and cons.\nWe opted for the spaCy text-splitter, which uses a powerful NLP library to convert sentences into tokens.\nThis results in more linguistically coherent text chunks with better context representation.\nAlso, the tokenization model is downloaded from the spaCy website, is not that large in size and can be used for free.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/towardsdatascience.com\/how-to-chunk-text-data-a-comparative-analysis-3858c4a0997a\">Comparison of Text-Splitters performance<\/a><\/p>\n<p><a href=\"https:\/\/medium.com\/@sushmithabhanu24\/retrieval-in-langchain-part-2-text-splitters-2d8c9d595cc9\">Overview LangChain Text Splitters<\/a><\/p>\n<p><a href=\"https:\/\/python.langchain.com\/v0.1\/docs\/modules\/data_connection\/document_transformers\/\">LangChain Documentation Text Splitters<\/a><\/p>\n<pre><code class=\"\" data-line=\"\">class TextSplitterService(TextSplitter):\n    def __init__(self, chunk_size: int):\n        self.splitter = SpacyTextSplitter(chunk_size=chunk_size)\n\n    @logger.log_decorator(level=&quot;debug&quot;, message=&quot;Creating chunks&quot;)\n    def split_text(self, text, document_id: str, owner_id: str, conversation_id: str) -&gt; List[ChunkModel]:\n        chunks = []\n        for page_number, page in enumerate(text, start=1):\n            page_chunks = self.splitter.split_text(page.page_content)\n            for on_page_index, chunk in tqdm(enumerate(page_chunks, start=1), total=len(page_chunks)):\n                metadata = ChunkMetadata(document_id=document_id,\n                                         owner_id=owner_id,\n                                         conversation_id=conversation_id,\n                                         
page_number=page_number,\n                                         on_page_index=on_page_index)\n                chunk_obj = ChunkModel(content=chunk, metadata=metadata)\n                chunks.append(chunk_obj)\n        return chunks\n<\/code><\/pre>\n<h3>4.2.3 Implementing the vector store<\/h3>\n<p>So now that we have our text chunks, we need to embed them and save them in the vector store.\nLangChain combines these two steps in one process.\nThis makes the implementation very easy and straightforward, but makes abstracting the process a bit harder.<\/p>\n<pre><code class=\"\" data-line=\"\">@logger.log_decorator(level=&quot;debug&quot;, message=&quot;Add chunks to collection&quot;)\ndef add_chunks(self, chunks: List[ChunkModel], embedding_model: EmbeddingModel) -&gt; bool:\n    documents = [Document(page_content=chunk.content, metadata=chunk.metadata) for chunk in chunks]\n\n    try:\n        QdrantVectorStore.from_documents(\n            documents=documents,\n            embedding=embedding_model.get_model(),\n            url=os.getenv(&quot;VECTOR_STORE_URL&quot;),\n            collection_name=os.getenv(&quot;VECTOR_STORE_COLLECTION&quot;),\n            metadata_payload_key=&quot;metadata&quot;,\n        )\n        return True\n    except Exception:\n        return False\n<\/code><\/pre>\n<h3>4.2.4 Creating a retriever<\/h3>\n<p>The first part of the RAG process is done.\nNow we want to create the retrieval process.<\/p>\n<p>We are using the database as our retriever, using the as_retriever interface.\nThis is the most basic retriever, but it will be enough for the first iteration of the project.<\/p>\n<p>There are generally two types of retrievers:<\/p>\n<ul>\n<li><strong>Similarity Retriever<\/strong>:\nThis retriever fetches the most similar documents to a given question.\nIt is the most basic retriever and is used in all of the tutorials.<\/li>\n<li><strong>MMR Retriever<\/strong>:\nThis retriever fetches the most relevant documents to a question, and then passes 
them to an algorithm that finds the most diverse documents.\nBy doing this, the user gets a broader view of the topic and can make a better decision.\nEspecially in the case of a bachelor thesis, this can be very helpful for citing different sources.<\/li>\n<\/ul>\n<pre><code class=\"\" data-line=\"\">    def get_mmr_retriever(self,\n                          user_id: str,\n                          document_id: Optional[str] = None,\n                          conversation_id: Optional[str] = None,\n                          k: int = int(MAX_K_RESULTS)):\n\n        must_conditions = self._filter_conditions(user_id=user_id,\n                                                  document_id=document_id,\n                                                  conversation_id=conversation_id)\n\n        vector_store_connection = self.vector_store.get_connection(embedding_model=self.embedding_model)\n        return vector_store_connection.as_retriever(\n            search_type=&quot;mmr&quot;,\n            search_kwargs={\n                &quot;k&quot;: k,  # Number of documents to return; Default is 5\n                &quot;fetch_k&quot;: 20,  # Number of documents to pass into mmr algorithm; Default is 20\n                &quot;lambda_mult&quot;: LAMBDA_MULT,  # Diversity of Documents. 
Default = 0.5; 1 = minimum diversity, 0 = maximum diversity\n                &quot;filter&quot;: models.Filter(\n                    must=must_conditions  # Filter for metadata\n                )\n            }\n        )\n<\/code><\/pre>\n<h3>4.2.5 Creating retrieval chains<\/h3>\n<p>Now that we have our retriever, we need to combine it with the LLM process.\nFor this we need to create a chain.\nChains are a way to combine different tasks, whether LLM calls or data-processing steps.<\/p>\n<p>But at this point, things are getting confusing.\nWhen we started implementing the project, the documentation did not provide a clear overview of the different methods and their parameters.\nTutorials were also no help, because they were outdated and did not work anymore.\nSo making this work was a challenge.<\/p>\n<p>For the retrieval process we needed to create three chains.<\/p>\n<ol>\n<li><strong><code class=\"\" data-line=\"\">create_history_aware_retriever<\/code><\/strong> &#8211; This chain is responsible for fetching relevant documents belonging to the user, while taking the chat history into account.<\/li>\n<li><strong><code class=\"\" data-line=\"\">create_stuff_documents_chain<\/code><\/strong> &#8211; This chain passes the retrieved documents to the LLM.<\/li>\n<li><strong><code class=\"\" data-line=\"\">create_retrieval_chain<\/code><\/strong> &#8211; This chain combines the first two chains and returns the answer to the user.<\/li>\n<\/ol>\n<p>*Chat histories are a collection of previous messages between the user and the LLM.\nBy providing the chat history, users are able to ask follow-up questions on topics discussed before.<\/p>\n<p>For the chains we also add prompts for the LLM to improve the quality of the answers.\nThe goal was to get an answer that is as close to the truth as possible.\nThis was very important for the goal of the project.<\/p>\n<p>At the same time we had to instruct the LLM to only answer questions that are related to the documents that were 
retrieved.<\/p>\n<p>These are the prompts we use to instruct the LLM.\nThe goals were:<\/p>\n<ul>\n<li>Consider the chat history<\/li>\n<li>Give the LLM an identity<\/li>\n<li>Limit the LLM to answer only questions that are related to the retrieved documents.<\/li>\n<li>Limit the length of the answer to a maximum of 5 sentences.\n<ul>\n<li>We have a hard token limit implemented, but by adding this instruction to the prompt, we were able to prevent the LLM from cutting off the answer in the middle of a sentence.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><strong>Prompt for <code class=\"\" data-line=\"\">create_history_aware_retriever<\/code><\/strong><\/p>\n<pre><code class=\"\" data-line=\"\">Given a chat history and the latest user question which might \nreference context in the chat history, formulate a standalone question which can be understood without the chat \nhistory. Do NOT answer the question, just reformulate it if needed and otherwise return it as is.\n<\/code><\/pre>\n<p><strong>Prompt for <code class=\"\" data-line=\"\">create_stuff_documents_chain<\/code><\/strong><\/p>\n<pre><code class=\"\" data-line=\"\">You are a helpful assistant for question-answering tasks.\nUse only the following pieces of retrieved context to answer the question.\nDon\u2019t justify your answers.\nDon\u2019t give information not mentioned in the CONTEXT INFORMATION.\nIf you don&#039;t know the answer, just say that you don&#039;t know.\nUse five sentences maximum and keep the answer concise.\n{context}\n<\/code><\/pre>\n<h3>4.2.6 Multi-Tenancy<\/h3>\n<p>A very important aspect of the project was multi-tenancy.\nThis means that multiple users can use the same service without interfering with each other.<\/p>\n<p>This is done by adding metadata to the embeddings that are added to the vector store.\nThese metadata filters need to be provided in the document retriever within the chain.<\/p>\n<p>Here is an example of a simple similarity search with metadata filters:<\/p>\n<p><strong>Search 
Function with Filters<\/strong><\/p>\n<pre><code class=\"\" data-line=\"\">return self.client.scroll(\n    collection_name=COLLECTION_NAME,\n    scroll_filter=models.Filter(\n        must=[\n            models.FieldCondition(\n                key=&quot;metadata.owner_id&quot;,\n                match=models.MatchValue(value=user_id)\n            ),\n            models.FieldCondition(\n                key=&quot;metadata.document_id&quot;,\n                match=models.MatchValue(value=document_id)\n            )\n        ]\n    ),\n    with_payload=True,\n    with_vectors=False,\n)\n<\/code><\/pre>\n<h3>4.3 Improve Architecture<\/h3>\n<p>During our incremental development, we managed to get a running RAG application, but the codebase was disorganized.<br>\nTypically, the code is executed as a single script.\nHowever, we attempted to split the code into different files and classes.\nGiven the complexity of the RAG process and its reliance on multiple interdependent services, it was essential to structure the code in a maintainable and comprehensible manner.<\/p>\n<p>For the refactoring process we set the following goals:<\/p>\n<ul>\n<li>The architecture should give a direct understanding of the business logic it represents.<\/li>\n<li>The code should be extensible in the future, for example by implementing summarization, agents or other services.<\/li>\n<li>The modules should be interchangeable. 
For example, switching the vector store from Qdrant to FAISS should be easy to implement.<\/li>\n<li>Objects needed for functions should be passed in as parameters, instead of being created in the function itself.<\/li>\n<\/ul>\n<h4>4.3.1 The Architecture<\/h4>\n<p>We decided to use a Domain Driven Design (DDD) approach.\nDDD is a software development approach that focuses on the domain of the problem, instead of the technical aspects.\nIt is a way to structure the code so that it is understandable for everyone, even if they are not familiar with the technology used.<\/p>\n<p>We identified the following <strong>core\/domains<\/strong>:<\/p>\n<ul>\n<li><strong>chunks<\/strong>: Logic for managing the split text chunks from uploaded documents<\/li>\n<li><strong>qa<\/strong>: The question and answer process.<\/li>\n<li><strong>retriever<\/strong>: Holds the logic for retrieving documents from the vector store.<\/li>\n<li><strong>upload<\/strong>: The process of uploading documents to the vector store.<\/li>\n<\/ul>\n<p>The business logic processes that orchestrate all objects in the domains and any other directories are placed in the <strong>core\/services<\/strong> directory.\nIt contains the logic for the RAG QA and document upload processes.<\/p>\n<p>The directory <strong>core\/external_services<\/strong> holds the implementations of all external services:<\/p>\n<ul>\n<li><strong>llm<\/strong>: connection to the LLM<\/li>\n<li><strong>embedding<\/strong>: connection to the embedding service<\/li>\n<li><strong>vector_store<\/strong>: connection to the vector store<\/li>\n<\/ul>\n<h4>4.3.2 Extensibility<\/h4>\n<p>The code should be easy to extend in the future.\nWhole new processes, like a web-crawling agent, can be added as new domains.\nThey can utilize the existing connections to the LLMs and vector stores.<\/p>\n<p>If another retriever is needed, it can be added to the retriever directory and be imported into any chain in any domain.\nIf another 
external service is needed, like a MongoDB database for storing additional data, it can be added to the external_services directory.<\/p>\n<h4>4.3.3 Interchangeability<\/h4>\n<p>All services and connections are implemented via interfaces.<\/p>\n<p>For example, the embedding model has an interface called EmbeddingModel with abstract methods.<\/p>\n<p>The function for adding new chunks to the database takes the interface EmbeddingModel as a parameter.\nThis way the function can be used with any embedding model that implements the EmbeddingModel interface.\nSo we can seamlessly switch from OpenAI embeddings to AWS embedding models without changing the function.<\/p>\n<h1>5. Learnings<\/h1>\n<p>Besides the theoretical knowledge about RAG and its technical implementation, we had some general learnings.<\/p>\n<ul>\n<li>Tutorials are nice to get an idea, but don\u2019t rely on them too much.<\/li>\n<li>Young technologies and new frameworks are prone to change. Be prepared and always look for the newest information.<\/li>\n<li>A large community can provide a lot of resources, but can also be overwhelming. Good documentation is key.<\/li>\n<\/ul>\n<h1>6. Review<\/h1>\n<p>We successfully reached our goals in this project.\nBut there are still a lot of improvements to be made.<\/p>\n<p>We solely focused on the RAG implementation, but neglected security aspects like authentication and authorization.\nEven the database is fully open to the public, without any security measures.\nThe next steps would be to implement security measures like OAuth2 and JWT for role-based access.<\/p>\n<p>We built a basic RAG application, but there are many improvements that can be made.\nThe current retriever using the MMR algorithm does not provide a score threshold. 
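Conceptually, such a threshold would simply drop weakly relevant chunks before they reach the LLM. A minimal, framework-free sketch of the idea (function name and scores are hypothetical, not from our codebase):

```python
def filter_chunks_by_score(results: list[tuple[str, float]],
                           threshold: float = 0.5) -> list[str]:
    """Keep only chunks whose relevance score meets the threshold."""
    return [chunk for chunk, score in results if score >= threshold]

# Hypothetical (chunk, relevance score) pairs from a retriever
results = [("nvidia chunk", 0.91), ("apple chunk", 0.62), ("unrelated chunk", 0.20)]
print(filter_chunks_by_score(results))  # ['nvidia chunk', 'apple chunk']
```

As far as we know, LangChain's as_retriever also accepts search_type="similarity_score_threshold" with a score_threshold search kwarg, which would be the natural place to add this in our setup.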
So even documents having only a relevance score of 20% are provided to the LLM.\nThere are also some more advanced retriever techniques, like a multi-query retriever, to fetch more and better-fitting documents that could be relevant to the user.<\/p>\n<p>We wanted to implement custom error handling messages. Due to our current implementation we only return a 500 error message if something goes wrong.\nErrors that should map to 4xx status codes are also caught as 500 errors.<\/p>\n<p>Because we chose PyMuPDF as our PDF reader, we always need to create a temporary file to read in the PDF.\nThis is not very efficient and can result in problems when the user base grows.<\/p>\n<p>Also, the current upload process blocks the server, because the file is read in synchronously.\nIt should be converted into an asynchronous background task to prevent the server from blocking.<\/p>\n<p>Lastly, even with improvements in the architecture, it is still not a 100% DDD implementation.\nBut with the current state of the project, it is a good starting point for further development.<\/p>\n<p>Nevertheless, we had fun working on this project.\nWe learned about the practical implementation of RAG and LangChain, beyond the typical tutorials.\nWith its diverse nature, we covered many aspects to create a service that can be useful for many different people.<\/p>\n<h1>Frontend<\/h1>\n<p>After successfully setting up the backend architecture to handle communication with OpenAI and manage user data, our focus shifted to developing a robust frontend. The frontend serves as the interface for users, acting as a bridge between them and our backend services. In this blog post, we want to share our approach to frontend development and the decisions we made along the way.<\/p>\n<h3>How Did We Approach the Frontend?<\/h3>\n<p>For frontend development, we chose React, one of the most popular JavaScript libraries for building user interfaces. 
React\u2019s component-based architecture makes it easy to create dynamic and interactive UIs, aligning well with modern web development practices. To ensure a consistent design language and enhance visual appeal, we used Material-UI. This library provides us with pre-built components such as buttons, cards, and navigation elements that adhere to Material Design principles. This not only sped up our development process but also ensured that our app maintained a professional and appealing look.<\/p>\n<p>For handling HTTP requests and communicating with our backend services, we used Axios. This library allows us to perform CRUD operations efficiently and manage requests and responses in a promise-based structure, which suits React\u2019s asynchronous nature well.<\/p>\n<h3>Challenges We Faced<\/h3>\n<p>During the development of the frontend, we encountered several challenges:<\/p>\n<p><strong>1. CORS Issues:<\/strong> One of the recurring challenges was dealing with <strong>Cross-Origin Resource Sharing (CORS)<\/strong> issues. Since our backend was hosted on a different domain or port than our frontend, we often encountered CORS policy errors when trying to make requests to the backend. To resolve these issues, we had to configure our backend properly to handle CORS and ensure that our frontend requests were correctly formatted to avoid these errors. Despite our efforts, these problems occasionally resurfaced, requiring continuous tweaks and adjustments.<\/p>\n<p><strong>2. State Management:<\/strong> Managing the state of the application was another challenge, especially with multiple asynchronous requests and user interactions. Although we did not use a dedicated state management library like Redux, we carefully managed the state within individual components and utilized React\u2019s Context API for global state where necessary. This approach required careful planning to ensure that state changes were handled efficiently.<\/p>\n<p><strong>3. 
Handling Asynchronous Data:<\/strong> Given that our application heavily relies on data fetched from the backend (such as user data, chat histories, and document information), handling asynchronous operations correctly was crucial. We needed to ensure that the UI remained responsive and provided meaningful feedback to users while data was being loaded or submitted. This involved implementing loading spinners, error handling, and retry mechanisms to improve the overall user experience.<\/p>\n<h3>What Could We Do Better?<\/h3>\n<p>While we are proud of what we achieved with the frontend, there is always room for improvement. For instance, we could refine our approach to state management further, possibly by integrating a more robust state management library if the complexity of our application grows. We also recognize that there are additional optimizations and best practices we could adopt to enhance performance and security.<\/p>\n<p>Although we managed to resolve most CORS-related issues, a deeper understanding of CORS policies and security headers would have been beneficial earlier in the process. This would have allowed us to build a more secure frontend from the start.<\/p>\n<h3>Looking Ahead<\/h3>\n<p>Our journey with frontend development was filled with learning and growth. We gained valuable experience in handling common challenges, optimizing performance, and ensuring a great user experience. Moving forward, we aim to refine our skills, explore new technologies, and continue building user-friendly applications.<\/p>\n<h1>Backend<\/h1>\n<h3>Task<\/h3>\n<p>First, let\u2019s talk about what the goal was.\nThe main job of our backend was to take users\u2019 documents and answer their prompts based on these documents. This involved many smaller tasks, such as receiving requests from the frontend, storing user data, chat histories and the uploaded documents, and of course communicating with OpenAI.<\/p>\n<p>These break down into two different types of tasks. 
On the one hand we have communicating with OpenAI and handling all the \u201cAI stuff\u201d with the documents; on the other hand we have storing and retrieving all the data (especially user data), as well as communicating with the frontend.<\/p>\n<p>So what we did was split those two into two different backends that are linked together.<\/p>\n<p>In this part we want to give you some insights into our decision making, explain how and why we solved our challenges the way we did, and outline what we learned.<\/p>\n<h3>How did we approach the backend?<\/h3>\n<p>We started with getting the connection to OpenAI to a state where we knew it works. We decided to do this first because we felt it would be the most challenging part. For more about this, read the part about our RAG system; here we will focus on the Node.js part of the backend.<\/p>\n<p>When we got to the point where the FastAPI (the RAG) was ready and working, we decided to let the FastAPI handle requests to OpenAI and work with the documents, while the rest of the logic, including the storing of user data, would be done in a different backend. Here we went for Node.js with MongoDB, because:<\/p>\n<ul>\n<li><strong>1. JavaScript Everywhere:<\/strong>\u00a0Node.js allows us to use JavaScript on both the frontend and backend, which streamlines the development process.<\/li>\n<li><strong>2. Non-blocking I\/O Model:<\/strong>\u00a0Node.js operates on a non-blocking, event-driven architecture, making it well-suited for handling multiple concurrent connections efficiently.<\/li>\n<li><strong>3. MongoDB\u2019s Flexibility:<\/strong>\u00a0MongoDB, being a NoSQL database, offers schema flexibility, which is perfect for projects where data structures might evolve over time.<\/li>\n<li><strong>4. Microservices Architecture:<\/strong>\u00a0Using Node.js for the backend aligns well with our decision to separate concerns and adopt a microservices architecture. 
Each service can be developed, deployed, and scaled independently, which improves the flexibility and maintainability of our system.<\/li>\n<li><strong>5. Rich Ecosystem:<\/strong>\u00a0Node.js comes with a vast ecosystem of open-source libraries via npm (Node Package Manager). This speeds up development as we can leverage existing libraries for everything from authentication to data validation.<\/li>\n<\/ul>\n<p>The next step was to think about the models we need for the database. This was not so easy in the beginning because we did not really know exactly which data we would need. What helped us was to clarify which functionality we were planning to implement and to think about which models would make sense to store and how they relate to each other. For example, if we want users to register, log in and upload documents, it makes sense that we need to store all the user data and documents, and we can see that these two are related to each other. But because we wanted to give users the possibility to have multiple conversations with different documents, we decided to put a conversations model in between, which stores things like the messages in the conversation, the user, the documents and so on.<\/p>\n<p>Here you can see an overview of our models and their relations to each other.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2024\/09\/CoStudy3.png\" alt=\"\"><\/p>\n<p>For the database connection we used mongoose as our ODM.<\/p>\n<p>After this was done, we implemented CRUD operations for all the different models. For the routes we used express as our web framework, because it is simple yet powerful.\nThe next step was to implement the logic for important features like the document upload and the \/response endpoint, where users send their prompts. The special thing about the latter is that it has to fetch the last messages in the chat so the AI has context later on. 
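<\/p>
<p>A minimal sketch of that context-building step (the function and field names are illustrative, not our actual implementation):<\/p>

```javascript
// Hedged sketch: keep only the most recent messages of a conversation as
// context for the FastAPI /response call. buildContext and the message
// shape are illustrative, not our actual implementation.
function buildContext(messages, maxMessages = 6) {
  // slice(-n) keeps the last n messages so the prompt stays small
  return messages.slice(-maxMessages).map(m => ({ role: m.role, content: m.content }));
}

const history = [
  { role: 'user', content: 'What is RAG?' },
  { role: 'assistant', content: 'Retrieval Augmented Generation combines retrieval and generation.' },
  { role: 'user', content: 'And how are documents chunked?' },
];

console.log(buildContext(history, 2)); // last two messages only
```

<p>The controller would then send the result of such a helper together with the new prompt to the FastAPI.<\/p>
<p>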
While implementing the backend logic there were several challenges, such as the architecture, multipart\/form-data, PDF storage, performance and security, but we will go into detail about these in the challenges part of this blog post.\nLastly, we did end-to-end tests, error handling and documentation.<\/p>\n<h3>Challenges<\/h3>\n<p>In the backend we faced several challenges, some of which we already pointed out above. To start with, Node.js was a new technology for us, as were the frameworks and libraries on top of it like mongoose and express, so getting started involved a lot of reading, trying and repeating.<\/p>\n<p>Another challenge was the logical architecture, not only for the whole backend, where we decided to go for a microservice approach, but also for how to logically structure the code and the models for the database. For the Node.js API we tried to cleanly separate the routes from the logic and from the API requests to the FastAPI, while keeping everything well-organized. We ensured this by having routes for every single model like documents or conversations. The routes redirect to the controllers, where the real logic is done, for example storing data, getting chat histories as context for the API and so on. From there, when a request has to be made, we use a service class whose job it is to put the data into a valid request and send it to the FastAPI.<\/p>\n<p>Another problem that took quite some time to solve was sending and storing the PDFs. When sending them we used the multipart\/form-data encoding in our requests, which was, at least for me, not easy to handle in Node.js. At first we stored and used the PDFs only as binary data, which led to another problem: performance. Storing the binary data directly in MongoDB turned out to be very inefficient; after requesting more than three PDFs, the request timed out. 
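<\/p>
<p>A typical remedy, implemented for example by MongoDB\u2019s GridFS, is to split large binaries into fixed-size chunks that are stored as separate documents; a stdlib-only sketch of the idea (names are illustrative):<\/p>

```javascript
// Sketch of the chunking idea (GridFS, for example, works this way):
// instead of embedding one huge binary per document, split the file into
// fixed-size chunks that can be stored and streamed as separate documents.
function toChunks(buffer, chunkSize = 255 * 1024) { // 255 KB, the GridFS default
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    chunks.push({ n: chunks.length, data: buffer.subarray(offset, offset + chunkSize) });
  }
  return chunks;
}

const pdf = Buffer.alloc(600 * 1024); // stand-in for a 600 KB PDF
console.log(toChunks(pdf).length); // → 3
```

<p>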
Of course this was not optimal, so we tried some libraries and, after putting in a lot of work, finally got it to work.<\/p>\n<p>Finally, there was one more challenge for us to solve: security. The problem was that because the backend was deployed publicly, everyone could theoretically use our API to send requests to OpenAI, which would cost us money. To fix this, we made every user enter their own OpenAI API key in the frontend, which is then sent with each request.<\/p>\n<h3>What could we do better?<\/h3>\n<p>Of course the way we did our project and solved our challenges is probably not perfect; it was just the way we thought we could do it, and it also reflects our learning path. One thing we are aware of but did not have time to optimize is security. We tried to use the bcrypt library to hash passwords and JWT tokens for authentication, but unfortunately we did not get everything to work as we wanted, even after quite a lot of hours put into it, so we decided to focus on the functionality first. Another solution we were considering would have been to store the API keys in the database too.<\/p>\n<p>We were also thinking of using different storage, in particular a cloud storage, to store the PDFs and only save the URLs in the database. 
This might have been more efficient and better for performance than always sending the PDFs themselves.\nFurther, even though we did write some end-to-end tests, more tests would always make sense.<\/p>\n<h3>Lessons Learned<\/h3>\n<p>What did we learn while doing this project?\nFirst, obviously, how to use Node.js and \u201cthe world around it\u201d to create an API with a database.\nAnother lesson was to rather spend more time up front designing all the necessary routes and models instead of reworking them multiple times; this would have saved a lot of time.\nThe next important learning was working with PDF files and how to store them efficiently, as well as how to build a simple microservice architecture with multiple backends communicating with each other.\nWe also learned that good Swagger documentation is very important when working together as a team, especially if you use different microservices, because this way you can ensure that the ones using your API know how to form a valid request and what to expect as a response.<\/p>\n<h3>Summary<\/h3>\n<p>The backend development for this project was a comprehensive learning experience, involving the integration of Node.js and MongoDB for efficient data management, alongside FastAPI for AI interactions. The journey included overcoming challenges related to technology adoption, architectural design, and data handling. Key lessons revolved around effective planning, efficient data storage, and the benefits of a microservices approach. Despite some areas for improvement, such as security and testing, the project successfully demonstrated the capability to handle complex backend tasks and integrate with external services effectively.<\/p>\n<h1>Deployment<\/h1>\n<h1>CI\/CD pipeline for RAG system on Azure with Terraform and GitHub Actions<\/h1>\n<h2>1. Introduction<\/h2>\n<p>In modern software development, the seamless integration of Continuous Integration (CI) and Continuous Deployment (CD) is a key success factor. 
In this blog post, we would like to show you how you can successfully deploy a Retrieval Augmented Generation (RAG) system using a CI\/CD pipeline based on GitHub Actions in combination with Terraform.<\/p>\n<p>The focus here is on the use of Terraform to provide the infrastructure efficiently and consistently. Terraform allows us to define the entire cloud infrastructure as code, which ensures not only the repeatability but also the scalability of the setup. In addition, we use GitHub Actions to make the automation process seamless. With GitHub Actions, we can set up continuous integration and deployment workflows. This reduces manual intervention on our part and therefore minimizes sources of error.<\/p>\n<p>Since our application is to be deployed in the cloud, the first question was: Which cloud provider is the best choice? After comparing several providers, including AWS, IBM, Google, Azure and Hetzner, we decided on Azure. This cloud provider offers all the services we need for our project, allowing us to focus on the implementation. In addition, the Azure for Students program allowed us to test our strategy risk-free in a \u201creal\u201d environment.<\/p>\n<p>Our deployment strategy is based on a well-thought-out, structured architecture with four separate repositories. The first repository is responsible for our RAG (Retrieval Augmented Generation) application, where the core logic and processing takes place. The second repository manages the database. The third repository contains the code for the frontend, which forms the user interface of our application. Finally, the fourth repository, the infrastructure repository, ensures that all necessary resources are provided and managed in the cloud. As soon as changes are pushed to one of the first three repositories, GitHub Actions automatically creates a new container image and uploads it to the GitHub Container Registry. 
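<\/p>
<p>Such an image-build workflow in one of the application repositories might look roughly like the following sketch (the image name is illustrative; the actions and the CR_PAT secret mirror the ones used in our infrastructure workflow):<\/p>

```yaml
# Hedged sketch of a deploy.yml in an application repository
name: Build and push image
on:
  push:
    branches:
      - main
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.CR_PAT }}
      - name: Build and push
        run: |
          docker build -t ghcr.io/software-dev-for-cloud-computing/nodejs:latest .
          docker push ghcr.io/software-dev-for-cloud-computing/nodejs:latest
```

<p>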
The infrastructure repository then accesses these images to deploy the applications as containers in an Azure Container Group.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2024\/09\/CoStudy4.png\" alt=\"\"><\/p>\n<h2>2. Structure of the Terraform code<\/h2>\n<h3>Terraform configuration<\/h3>\n<p>The complete Terraform code can be found here: https:\/\/github.com\/software-dev-for-cloud-computing\/infrastructure\/blob\/main\/main.tf<\/p>\n<pre><code class=\"\" data-line=\"\">main.tf\nterraform {\n  required_providers {\n    azurerm = {\n      source  = &quot;hashicorp\/azurerm&quot;\n      version = &quot;=3.107.0&quot;\n    }\n  }\n}\n\nprovider &quot;azurerm&quot; {\n  features {}\n}\n<\/code><\/pre>\n<ul>\n<li><strong>Terraform block:<\/strong> Defines the required Terraform providers and their versions. The azurerm provider is used here to interact with Azure resources.<\/li>\n<li><strong>Provider block:<\/strong> Configures the azurerm provider and activates optional functions.<\/li>\n<\/ul>\n<h3>Resource provisioning<\/h3>\n<pre><code class=\"\" data-line=\"\">main.tf\nresource &quot;random_pet&quot; &quot;name&quot; {\n  length    = 2\n  separator = &quot;-&quot;\n}\n\nresource &quot;azurerm_resource_group&quot; &quot;main&quot; {\n  name     = &quot;myResourceGroup-${random_pet.name.id}&quot;\n  location = &quot;Germany West Central&quot;\n}\n<\/code><\/pre>\n<ul>\n<li><strong>random_pet resource:<\/strong> Generates a random name to ensure that the resource group names are unique and do not collide with existing resources.<\/li>\n<li><strong>azurerm_resource_group resource:<\/strong> Creates an Azure resource group in which all other resources are organised. The resource group is given a unique name based on the random name generated. 
The location is set to \u2018Germany West Central\u2019.<\/li>\n<\/ul>\n<pre><code class=\"\" data-line=\"\">main.tf\nresource &quot;azurerm_container_group&quot; &quot;main_container&quot; {\n  name                = &quot;rag-ss-dev4coud-${random_pet.name.id}&quot;\n  location            = azurerm_resource_group.main.location\n  resource_group_name = azurerm_resource_group.main.name\n  os_type             = &quot;Linux&quot;\n  ip_address_type     = &quot;Public&quot;\n  dns_name_label      = &quot;rag-ss-dev4coud-hdm-stuttgart-2024&quot;\n...\n}\n<\/code><\/pre>\n<ul>\n<li><strong>azurerm_container_group resource:<\/strong> Creates an Azure Container Instance group that hosts multiple containers.<\/li>\n<li><strong>name:<\/strong> Gives the container group a unique name that contains the generated random name.<\/li>\n<li><strong>location and resource_group_name:<\/strong> Assigns the container group to the previously created resource group.<\/li>\n<li><strong>os_type:<\/strong> Specifies that the containers are executed on a Linux-based operating system.<\/li>\n<li><strong>ip_address_type:<\/strong> Assigns a public IP address to the container group so that it can be reached from outside.<\/li>\n<li><strong>dns_name_label:<\/strong> Defines a DNS name for accessing the container group.<\/li>\n<\/ul>\n<h3>Container definitions<\/h3>\n<p>The individual containers are defined within the azurerm_container_group. 
Each container represents a microservice of the application.<\/p>\n<pre><code class=\"\" data-line=\"\">main.tf\n\ncontainer {\n    name   = &quot;qdrant&quot;\n    image  = &quot;ghcr.io\/software-dev-for-cloud-computing\/qdrant:latest&quot;\n    cpu    = &quot;0.5&quot;\n    memory = &quot;1.5&quot;\n\n    ports {\n      port     = 6333\n      protocol = &quot;TCP&quot;\n    }\n  }\n<\/code><\/pre>\n<ul>\n<li>qdrant Container:\n<ul>\n<li><strong>name:<\/strong> Names the container as \u2018qdrant\u2019.<\/li>\n<li><strong>image:<\/strong> Uses the Qdrant image from the specified GitHub Container Registry.<\/li>\n<li><strong>cpu and memory:<\/strong> Assigns 0.5 CPU cores and 1.5 GB memory to the container.<\/li>\n<li><strong>ports:<\/strong> Opens port 6333 for TCP connections to enable access to Qdrant.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Similar container definitions exist for mongodb, react, fastapi and nodejs. Each container has specific settings for image, resource allocation, ports and environment variables that are tailored to the respective requirements of the microservice.<\/p>\n<h3>Environment variables<\/h3>\n<p>Environment variables are defined within the container definitions in order to customise the configuration of the microservices.<\/p>\n<pre><code class=\"\" data-line=\"\">environment_variables = {\n      MONGO_INITDB_DATABASE      = &quot;dev4cloud&quot;\n      MONGODB_PORT               = &quot;27017&quot;\n      DISABLE_MONGO_AUTH         = &quot;true&quot;\n    }\n<\/code><\/pre>\n<h2>3. GitHub Actions Workflow<\/h2>\n<p>When implementing the GitHub Actions workflow for our project, we managed the individual repositories (backend, frontend, AI) in a clear organisational structure within our GitHub organisation. 
This allowed us to efficiently manage the specific requirements and dependencies for each component.<\/p>\n<p><strong>Setting up the repositories:<\/strong><\/p>\n<p>First, we created the repositories in our GitHub organisation and structured each one so that the backend, frontend and AI components are managed separately. This organisation facilitates modularity and enables better collaboration between the various development teams.<\/p>\n<p><strong>Creation of the workflow files:<\/strong><\/p>\n<p>For each repository, we configured individual deploy.yml files within GitHub Actions. These files control the specific build and deployment process. We started with basic steps such as checking out the code (actions\/checkout@v2) and set up the environment to have all the necessary dependencies ready for the successful build and deployment.<\/p>\n<p><strong>Docker integration:<\/strong><\/p>\n<p>Since our project is based on Docker containers, we designed the workflow so that a new Docker image is created and automatically pushed to the GitHub container registry every time the main branch is pushed. This ensures that an up-to-date and executable version of our applications is always available.<\/p>\n<h3>Infrastructure Repo &#8211; GitHub Actions Runner<\/h3>\n<p>You can find the GitHub workflow file here: https:\/\/github.com\/software-dev-for-cloud-computing\/infrastructure\/blob\/main\/.github\/workflows\/deploy.yml<\/p>\n<h4>1. Trigger (on)<\/h4>\n<pre><code class=\"\" data-line=\"\">on:\n  push:\n    branches:\n      - main\n      - test_branch\n<\/code><\/pre>\n<p>The workflow starts automatically when code is pushed to the main or test_branch branches.<\/p>\n<h4>2. Job (jobs: terraform:)<\/h4>\n<pre><code class=\"\" data-line=\"\">jobs:\n  terraform:\n    runs-on: ubuntu-latest\n<\/code><\/pre>\n<p>runs-on: ubuntu-latest: The workflow is executed on a virtual machine with the latest Ubuntu operating system.<\/p>\n<h4>3. 
Steps (steps)<\/h4>\n<pre><code class=\"\" data-line=\"\">    - name: Checkout repository\n      uses: actions\/checkout@v2\n\n    - name: Login to GitHub Container Registry\n      uses: docker\/login-action@v2\n      with:\n        registry: ghcr.io\n        username: ${{ github.actor }}\n        password: ${{ secrets.CR_PAT }}\n\n    - name: Set up Terraform\n      uses: hashicorp\/setup-terraform@v1\n      with:\n        terraform_version: latest\n\n    - name: Azure login\n      uses: azure\/login@v1\n      with:\n        creds: ${{ secrets.AZURE_CREDENTIALS }}\n\n    - name: Install JQ\n      run: sudo apt-get install -y jq\n\n    - name: Terraform Init\n      run: terraform init -upgrade\n\n    - name: Terraform Plan\n      run: terraform plan -out=tfplan\n\n    - name: Terraform Apply\n      run: terraform apply -auto-approve tfplan\n\n    - name: Outputting Terraform Outputs to JSON file\n      run: |\n        echo &quot;Outputting Terraform Outputs to JSON file...&quot;\n        terraform output -json &gt; tf_outputs.json\n      continue-on-error: true\n\n    - name: Debugging Terraform Outputs on failure\n      if: failure()\n      run: |\n        echo &quot;Debugging Terraform Outputs on failure&quot;\n        echo &quot;Contents of tf_outputs.json:&quot;\n        cat tf_outputs.json\n      continue-on-error: true\n\n    - name: Upload Terraform Outputs\n      if: always()\n      uses: actions\/upload-artifact@v3\n      with:\n        name: tf_outputs\n        path: tf_outputs.json\n<\/code><\/pre>\n<ul>\n<li><strong>Checkout repository:<\/strong> Downloads the code of the infrastructure repository to the runner so that it can work on it.<\/li>\n<li><strong>Login to GitHub Container Registry:<\/strong> Logs in to the GitHub Container Registry to be able to download the required container images later. 
A GitHub user name and a personal access token (PAT), which is stored as a secret in the repository, are used for this purpose.<\/li>\n<li><strong>Set up Terraform:<\/strong> Installs the latest version of Terraform on the runner.<\/li>\n<li><strong>Azure login:<\/strong> Logs in to Azure to be able to create the Terraform resources in the Azure account. The login information is read from another secret.<\/li>\n<li><strong>Install JQ:<\/strong> Installs the command line tool jq, which is later used to parse JSON data.<\/li>\n<li><strong>Terraform Init:<\/strong> Initialises the Terraform working directory and downloads all required providers.<\/li>\n<li><strong>Terraform Plan:<\/strong> Creates an execution plan that shows the changes Terraform would make to your infrastructure. The plan is saved in the tfplan file.<\/li>\n<li><strong>Terraform Apply:<\/strong> Applies the previously created plan and creates the resources in Azure.<\/li>\n<li><strong>Outputting Terraform Outputs to JSON file:<\/strong> Saves the outputs of Terraform (e.g. the IP address of your new virtual machine) in a JSON file called tf_outputs.json. (Currently there are no outputs any more.)<\/li>\n<li><strong>Upload Terraform Outputs:<\/strong> Uploads the tf_outputs.json file as an artefact so that you can download and use it later.<\/li>\n<\/ul>\n<h3>Summary<\/h3>\n<p>This workflow automates the deployment of your infrastructure with Terraform in Azure. Whenever you make changes to your Terraform code and push them to main or test_branch, the workflow is executed and ensures that your infrastructure in Azure is up to date.<\/p>\n<h3>Why GitHub Actions?<\/h3>\n<p>We chose GitHub Actions because it offers seamless integration with our code repository. 
Everything, from code changes to automated deployments, happens directly on the platform where our code lives, without additional tools or complex setups.<\/p>\n<h4>Strengths:<\/h4>\n<p><strong>Seamless integration:<\/strong> GitHub Actions is perfectly integrated with GitHub, which means that our workflows are triggered directly with every push or pull request.<\/p>\n<p><strong>Flexibility:<\/strong> Each workflow could be customised to the needs of our repositories (backend, frontend, AI) without becoming overly complex.<\/p>\n<p><strong>Diverse actions:<\/strong> The GitHub Marketplace offers a wide range of pre-built actions that greatly simplify our workflow.<\/p>\n<h4>Weaknesses:<\/h4>\n<p><strong>Error handling:<\/strong> Error logging is sometimes too simplistic, making it difficult to accurately identify issues in complex pipelines.<\/p>\n<p><strong>Limited monitoring:<\/strong> GitHub Actions offers basic monitoring, but lacks the in-depth analysis and monitoring capabilities that specialised CI\/CD tools offer.<\/p>\n<p><strong>Platform dependency:<\/strong> The strong commitment to GitHub can be a disadvantage if parts of the project are hosted on other platforms.<\/p>\n<h2>4. Disadvantages and challenges<\/h2>\n<p>Even though our CI\/CD setup for the RAG system on Azure with Terraform and GitHub Actions offers many advantages, there were some notable challenges that we had to overcome.<\/p>\n<p><strong>Complexity:<\/strong><\/p>\n<p>Building a CI\/CD pipeline for a complex system like ours requires deep expertise. The combination of Terraform, Azure and GitHub Actions brings with it a certain complexity that should not be underestimated. In particular, setting up Infrastructure-as-Code (IaC) and automating the deployments required thorough planning and a lot of fine-tuning.<\/p>\n<p><strong>Maintenance effort:<\/strong><\/p>\n<p>A pipeline is not a \u2018once-and-done\u2019 project. 
It needs to be regularly updated and monitored to ensure that all components work together seamlessly. Any change in the infrastructure, codebases or tools used may require adjustments to the pipeline. This means that continuous maintenance and occasional troubleshooting sessions are necessary to keep operations running smoothly.<\/p>\n<p><strong>Security aspects:<\/strong><\/p>\n<p>Another critical issue is the protection of sensitive data. In an environment where we are continuously deploying to the cloud and interacting with various systems, the security of credentials and other sensitive information is of paramount importance. We had to ensure that our secrets were well protected and that access to critical systems was strictly controlled.<\/p>\n<h2>5. Looking left and right<\/h2>\n<h3>Alternative CI\/CD Tools<\/h3>\n<p>While GitHub Actions and Terraform offer a strong combination for CI\/CD and infrastructure-as-code, it\u2019s worth taking a look at alternative tools that may be more suitable depending on your specific requirements and preferences.<\/p>\n<ul>\n<li><strong>Jenkins:<\/strong> An open source veteran in the field of CI\/CD that impresses with its enormous flexibility and customisability. Jenkins offers a wide range of plugins that make it possible to automate almost any conceivable workflow. However, it requires a little more configuration effort than GitHub Actions.<\/li>\n<li><strong>GitLab CI\/CD:<\/strong> Tightly integrated into the GitLab platform, GitLab CI\/CD provides a seamless experience for teams already using GitLab for version control. It offers an easy-to-use YAML-based configuration and supports a variety of use cases.<\/li>\n<li><strong>CircleCI:<\/strong> A cloud-based CI\/CD platform known for its speed and scalability. 
CircleCI offers an intuitive user interface and supports a wide range of programming languages and frameworks.<\/li>\n<\/ul>\n<h3>Infrastructure-as-code tools<\/h3>\n<p>There are also alternatives to Terraform, both platform-independent and platform-specific tools:<\/p>\n<ul>\n<li><strong>Bicep:<\/strong> A domain-specific language (DSL) from Microsoft that was developed specifically for the provision of Azure resources. Bicep offers a more precise and declarative syntax than ARM templates, which improves readability and maintainability.<\/li>\n<li><strong>AWS CloudFormation:<\/strong> The native IaC tool from Amazon Web Services, ideal for managing AWS infrastructures. CloudFormation uses JSON or YAML templates to define resources and offers a variety of features for automating and orchestrating deployments.<\/li>\n<li><strong>Pulumi:<\/strong> A cloud-agnostic IaC platform that makes it possible to define infrastructures in various cloud providers (e.g. AWS, Azure, Google Cloud) with common programming languages such as TypeScript, Python or Go. Pulumi offers a high degree of flexibility and allows you to utilise the full power of your preferred programming language.<\/li>\n<\/ul>\n<h3>Next possible steps for the infrastructure<\/h3>\n<p>The application is executable with the IaC script mentioned above. However, several next steps should be considered for a project that is meant to stay online in the long term:<\/p>\n<h3>Web apps or Azure Kubernetes Service (AKS) instead of containers<\/h3>\n<p>Whilst containers offer a flexible and portable way to package and deploy applications, there are scenarios where other services such as web apps or AKS may be more beneficial:<\/p>\n<ul>\n<li>\n<p><strong>Azure Web Apps:<\/strong> Ideal for simple web applications and APIs that require rapid deployment and scaling without having to worry about the underlying infrastructure. 
Web Apps offer automatic scaling, high availability and integrated CI\/CD functions.<\/p>\n<\/li>\n<li>\n<p><strong>Azure Kubernetes Service (AKS):<\/strong> Provides a managed Kubernetes platform for containerised applications that require high scalability, flexibility and control. AKS is particularly suitable for complex applications with multiple microservices or for scenarios where precise control over the infrastructure is required.<\/p>\n<\/li>\n<\/ul>\n<h3>Optimisations in the Terraform structure<\/h3>\n<p>The Terraform structure can be further improved by the following measures:<\/p>\n<ul>\n<li>\n<p><strong>Modularisation:<\/strong> Splitting the Terraform configuration into smaller, reusable modules to improve readability, maintainability and reusability.<\/p>\n<\/li>\n<li>\n<p><strong>Variables and outputs:<\/strong> Use of variables and outputs to centralise configuration parameters and increase reusability.<\/p>\n<\/li>\n<\/ul>\n<h3>Integrate security options<\/h3>\n<p>The security of the infrastructure can be increased by the following measures:<\/p>\n<ul>\n<li>\n<p><strong>Azure Key Vault:<\/strong> Use of Azure Key Vault for secure storage of secrets and certificates to control access to sensitive data.<\/p>\n<\/li>\n<li>\n<p><strong>Azure VNETs:<\/strong> The use of virtual networks (VNETs) enables an isolated and secure network environment for the resources. By configuring network security groups (NSGs), access to the front end can be restricted, while the other services are only accessible within the VNET.<\/p>\n<\/li>\n<\/ul>\n<h3>Load balancer (for large requests)<\/h3>\n<p>For large requests, a load balancer can be used to distribute the incoming traffic evenly across several instances of the application. 
This increases the scalability and availability of the application, especially during peak loads.<\/p>\n<h3>From GitHub Container Registry to Azure Container Registry: A step towards cloud integration<\/h3>\n<p>While GitHub Container Registry offers a convenient way to store container images close to your code, migrating to Azure Container Registry (ACR) can offer significant benefits at certain scales.<\/p>\n<h3>Advantages of using Azure Container Registry:<\/h3>\n<ul>\n<li>\n<p><strong>Deeper integration into the Azure cloud:<\/strong> ACR is natively integrated into the Azure platform and enables seamless collaboration with other Azure services such as Azure Kubernetes Service (AKS), Azure App Service and Azure Container Instances.<\/p>\n<\/li>\n<li>\n<p><strong>Global replication:<\/strong> ACR supports geographic replication of images to enable fast and reliable deployment across regions, reducing latency and improving performance.<\/p>\n<\/li>\n<li>\n<p><strong>Private network connection:<\/strong> ACR can be connected to your virtual network via Azure Private Link to provide secure and private access to your container images without going over the public internet.<\/p>\n<\/li>\n<\/ul>\n<h3>From multi-repo to mono-repo<\/h3>\n<p>Another approach to improving efficiency and collaboration within the development process is to merge the four separate repositories (react, nodejs, fastapi, infrastructure) into one mono-repo.<\/p>\n<h3>Advantages of a mono-repo<\/h3>\n<ul>\n<li><strong>Simplified dependency management:<\/strong> All project components are in one central location, making it easier to manage dependencies and track changes.<\/li>\n<li><strong>Atomic commits and code reviews:<\/strong> Changes to multiple components can be combined into a single commit, increasing clarity and making code reviews more efficient.<\/li>\n<li><strong>Sharing tools and configuration:<\/strong> Tools, scripts and configuration files can be shared across the project, leading to 
better consistency and reduced maintenance.<\/li>\n<li><strong>Promoting collaboration:<\/strong> Developers have insight into the entire codebase and can more easily contribute to different components, which encourages collaboration and knowledge sharing within the team.<\/li>\n<\/ul>\n<h3>Integration of end-to-end tests into the CI\/CD pipeline<\/h3>\n<p>To ensure the quality and reliability of our application, we integrate end-to-end tests (E2E tests) into our CI\/CD pipeline. E2E tests simulate the behaviour of a real user and check whether all components of the system work together as expected. This offers several advantages:<\/p>\n<ul>\n<li><strong>Early error detection:<\/strong> Errors are recognised at an early stage, before they reach production.<\/li>\n<li><strong>Faster feedback loops:<\/strong> Developers receive rapid feedback on the status of the application.<\/li>\n<li><strong>Higher quality and reliability:<\/strong> Because E2E tests cover complete user flows, they improve the overall quality and reliability of each release.<\/li>\n<\/ul>\n<h1>6. Conclusion<\/h1>\n<p>The deployment of our RAG system on Azure using a CI\/CD pipeline was an enlightening experience that highlighted the challenges and benefits of modern DevOps practices.<\/p>\n<p><strong>Key Takeaways:<\/strong><\/p>\n<p>One of the most significant insights we gained is that a well-structured CI\/CD pipeline not only accelerates the development process but also significantly enhances the quality and reliability of the final product. By managing the different components of our application in separate repositories within a GitHub organization, we were able to ensure a clear division of responsibilities, leading to more efficient collaboration and faster issue resolution.<\/p>\n<p><strong>Final Recommendations and Tips:<\/strong><\/p>\n<p>For future projects, we recommend establishing a clear structure for repositories and workflows from the outset.
It is also crucial to conduct regular reviews and updates of the pipeline to ensure it remains aligned with current requirements and technologies. Integrating security tests and carefully managing sensitive data within the pipeline should also be top priorities.<\/p>\n<p><strong>Best Practices for CI\/CD Pipelines:<\/strong><\/p>\n<p>Best practices include automating recurring tasks, regularly testing code in a staging environment, and ensuring that every step of the pipeline is reproducible and well-documented. Using feature branches for major changes and integrating them into the main branch through pull requests helps maintain a stable and clean codebase.<\/p>\n<p><strong>Key Considerations When Implementing RAG Systems:<\/strong><\/p>\n<p>When implementing RAG systems, it is important to consider the specific requirements for data processing and performance. Leveraging tools like Terraform for infrastructure management, in conjunction with a robust CI\/CD pipeline, enables scalable and secure operation of such systems. 
Additionally, focusing on resource optimization and minimizing latency is crucial to ensure a fast and reliable system.<\/p>\n<p>Overall, this project has demonstrated how critical a well-thought-out and effectively implemented CI\/CD pipeline is to the success of modern software projects.<\/p>\n<\/div>\n","protected":false}}