[GPT][QUIZGPT] Controlling the Data Format with an Output Parser
2024. 4. 18. 19:15
On the previous page, a formatter prompt was used to shape the output into the desired form. Here that result is converted into a Python object by an output parser.
import json
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.callbacks import StreamingStdOutCallbackHandler
import streamlit as st
from langchain.retrievers import WikipediaRetriever
from langchain.schema import BaseOutputParser
class JsonOutputParser(BaseOutputParser):
    def parse(self, text):
        # Drop Markdown code-fence markers and the "json" language tag, then parse.
        text = text.replace("```", "").replace("json", "")
        return json.loads(text)
output_parser = JsonOutputParser()
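One caveat with the parser above: `.replace("json", "")` removes every occurrence of the substring `json`, including inside question text (for example, in a quiz about the JSON format itself). A minimal sketch of a stricter variant, using only the standard library, strips just a surrounding code fence and its optional language tag. The names `strip_code_fence` and `parse_quiz_json` are hypothetical and are not part of LangChain.

```python
import json
import re

# Matches an optional Markdown code fence (with an optional "json" tag)
# wrapped around the payload. Hypothetical helper, not a LangChain API.
_FENCE = re.compile(r"^\s*```(?:json)?\s*(.*?)\s*```\s*$", re.DOTALL)


def strip_code_fence(text: str) -> str:
    """Return the text inside a surrounding code fence, or the text itself."""
    match = _FENCE.match(text)
    return match.group(1) if match else text.strip()


def parse_quiz_json(text: str) -> dict:
    """Parse model output that may or may not be wrapped in a code fence."""
    return json.loads(strip_code_fence(text))


raw = '```json\n{"questions": [{"question": "What is json?", "answers": []}]}\n```'
parsed = parse_quiz_json(raw)
print(parsed["questions"][0]["question"])
```

The body of `parse_quiz_json` could be dropped into `JsonOutputParser.parse` if this behavior is preferred; the word "json" inside the question text is left untouched.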
llm = ChatOpenAI(
def format_docs(docs):
    return "\n\n".join(document.page_content for document in docs)
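`format_docs` relies only on each item exposing a `page_content` attribute, so its behavior can be checked without loading a real file. In the sketch below, `FakeDocument` is a hypothetical stand-in for the document objects returned by the loader or the retriever.

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    # Hypothetical stand-in: only the attribute that format_docs reads.
    page_content: str


def format_docs(docs):
    # Join the text of every chunk, separated by a blank line.
    return "\n\n".join(document.page_content for document in docs)


docs = [FakeDocument("First chunk."), FakeDocument("Second chunk.")]
print(format_docs(docs))
```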
questions_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """
    You are a helpful assistant that is role playing as a teacher.
    Based ONLY on the following context make 10 questions to test the user's knowledge about the text.
    Each question should have 4 answers, three of them must be incorrect and one should be correct.
    Use (o) to signal the correct answer.

    Question examples:

    Question: What is the color of the ocean?
    Answers: Red|Yellow|Green|Blue(o)

    Question: What is the capital of Georgia?
    Answers: Baku|Tbilisi(o)|Manila|Beirut

    Question: When was Avatar released?
    Answers: 2007|2001|2009(o)|1998

    Question: Who was Julius Caesar?
    Answers: A Roman Emperor(o)|Painter|Actor|Model

    Your turn!

    Context: {context}
""",
        )
    ]
)
questions_chain = {"context": format_docs} | questions_prompt | llm
formatting_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """
    You are a powerful formatting algorithm.

    You format exam questions into JSON format.
    Answers with (o) are the correct ones.

    Example Input:

    Question: What is the color of the ocean?
    Answers: Red|Yellow|Green|Blue(o)

    Question: What is the capital of Georgia?
    Answers: Baku|Tbilisi(o)|Manila|Beirut

    Question: When was Avatar released?
    Answers: 2007|2001|2009(o)|1998

    Question: Who was Julius Caesar?
    Answers: A Roman Emperor(o)|Painter|Actor|Model

    Example Output:

    {{ "questions": [
            {{
                "question": "What is the color of the ocean?",
                "answers": [
                    {{ "answer": "Red", "correct": false }},
                    {{ "answer": "Yellow", "correct": false }},
                    {{ "answer": "Green", "correct": false }},
                    {{ "answer": "Blue", "correct": true }}
                ]
            }},
            {{
                "question": "What is the capital of Georgia?",
                "answers": [
                    {{ "answer": "Baku", "correct": false }},
                    {{ "answer": "Tbilisi", "correct": true }},
                    {{ "answer": "Manila", "correct": false }},
                    {{ "answer": "Beirut", "correct": false }}
                ]
            }},
            {{
                "question": "When was Avatar released?",
                "answers": [
                    {{ "answer": "2007", "correct": false }},
                    {{ "answer": "2001", "correct": false }},
                    {{ "answer": "2009", "correct": true }},
                    {{ "answer": "1998", "correct": false }}
                ]
            }},
            {{
                "question": "Who was Julius Caesar?",
                "answers": [
                    {{ "answer": "A Roman Emperor", "correct": true }},
                    {{ "answer": "Painter", "correct": false }},
                    {{ "answer": "Actor", "correct": false }},
                    {{ "answer": "Model", "correct": false }}
                ]
            }}
        ]
    }}

    Your turn!

    Questions: {context}
""",
        )
    ]
)
formatting_chain = formatting_prompt | llm
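To make the target structure concrete, the mapping that the formatting prompt asks the model to perform can be sketched in plain Python. This only illustrates the input and output shapes and is not a replacement for the LLM-based `formatting_chain`. The function name `format_questions` is hypothetical, and the sketch assumes the text follows the `Question:` / `Answers:` layout exactly.

```python
import json


def format_questions(text: str) -> dict:
    """Convert 'Question:' / 'Answers:' lines into the quiz JSON structure."""
    questions = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Question:"):
            questions.append(
                {"question": line[len("Question:"):].strip(), "answers": []}
            )
        elif line.startswith("Answers:") and questions:
            for option in line[len("Answers:"):].split("|"):
                option = option.strip()
                # A trailing "(o)" marks the correct answer.
                correct = option.endswith("(o)")
                if correct:
                    option = option[: -len("(o)")].strip()
                questions[-1]["answers"].append(
                    {"answer": option, "correct": correct}
                )
    return {"questions": questions}


sample = """
Question: What is the color of the ocean?
Answers: Red|Yellow|Green|Blue(o)
"""
print(json.dumps(format_questions(sample), indent=2))
```

A real model response is rarely this regular, which is one reason the post delegates the conversion to a second prompt and only parses the resulting JSON.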
@st.cache_data(show_spinner="Loading file...")
def split_file(file):
    file_content =
    file_path = f"./.cache/quiz_files/{}"
    # Persist the uploaded bytes so the loader can read them from disk.
    with open(file_path, "wb") as f:
        f.write(file_content)
    splitter = CharacterTextSplitter.from_tiktoken_encoder(
    loader = UnstructuredFileLoader(file_path)
    docs = loader.load_and_split(text_splitter=splitter)
    return docs
with st.sidebar:
    docs = None
    choice = st.selectbox(
        "Choose what you want to use.",
        (
            "File",
            "Wikipedia Article",
        ),
    )
    if choice == "File":
        file = st.file_uploader(
            "Upload a .docx, .txt or .pdf file",
            type=["pdf", "txt", "docx"],
        )
        if file:
            docs = split_file(file)
    else:
        topic = st.text_input("Search Wikipedia...")
        if topic:
            retriever = WikipediaRetriever(top_k_results=5)
            with st.status("Searching Wikipedia..."):
                docs = retriever.get_relevant_documents(topic)
if not docs:
    st.markdown(
        """
    Welcome to QuizGPT.

    I will make a quiz from Wikipedia articles or files you upload to test your knowledge and help you study.

    Get started by uploading a file or searching on Wikipedia in the sidebar.
    """
    )
else:
    start = st.button("Generate Quiz")
    if start:
        # questions_chain writes the quiz, formatting_chain converts it to JSON text,
        # and output_parser turns that text into a Python dictionary.
        chain = {"context": questions_chain} | formatting_chain | output_parser
        response = chain.invoke(docs)
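Because `output_parser` returns a plain Python dictionary, the result of `chain.invoke(docs)` can be consumed without further string handling. The following is a minimal sketch, assuming the response follows the structure shown in the example output of the formatting prompt. The `sample_response` data and the `correct_answers` helper are hypothetical and stand in for a real model response.

```python
def correct_answers(response: dict) -> dict:
    """Map each question to the answer flagged as correct."""
    result = {}
    for question in response["questions"]:
        for answer in question["answers"]:
            if answer["correct"]:
                result[question["question"]] = answer["answer"]
    return result


# Hypothetical stand-in for the dictionary returned by chain.invoke(docs).
sample_response = {
    "questions": [
        {
            "question": "When was Avatar released?",
            "answers": [
                {"answer": "2007", "correct": False},
                {"answer": "2001", "correct": False},
                {"answer": "2009", "correct": True},
                {"answer": "1998", "correct": False},
            ],
        }
    ]
}

for question, answer in correct_answers(sample_response).items():
    print(f"{question} -> {answer}")
```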