Showing posts with label LLama. Show all posts

Tuesday, January 14, 2025

Cheshire Cat AI

Cheshire Cat AI is an Italian project that is growing right now.


To try it, clone the project:

git clone https://github.com/cheshire-cat-ai/local-cat.git

On x86 you can use the compose.yml file to get an installation of Ollama, Cheshire Cat AI and Qdrant.

If you do not have an NVIDIA GPU, comment out the last lines:

#    deploy:
#      resources:
#        reservations:
#          devices:
#            - driver: nvidia
#              count: all
#              capabilities: [ gpu ]
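Whether that section applies can be checked before bringing the stack up; a minimal sketch (my own helper, assuming the NVIDIA driver puts nvidia-smi on the PATH when a GPU is usable):

```python
import shutil

def has_nvidia_gpu() -> bool:
    # nvidia-smi ships with the NVIDIA driver; if it is not on the PATH,
    # the compose GPU reservation would make `docker compose up` fail.
    return shutil.which("nvidia-smi") is not None

if __name__ == "__main__":
    if has_nvidia_gpu():
        print("NVIDIA driver found: keep the deploy/resources section")
    else:
        print("No nvidia-smi found: comment out the GPU reservation lines")
```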

 

On a Mac it is better to install the native Ollama app for Apple Silicon (M1), so as to exploit Metal acceleration, and to use the docker-compose-mac.yaml file, which creates Docker containers only for Cheshire Cat AI and Qdrant.

To bring everything up on x86, run

docker compose up -d

and then install the models inside the container (in this case Mistral):

docker exec ollama_cat ollama pull mistral:7b-instruct-q2_K

To interact with Cheshire Cat, point the browser at


http://localhost:1865/admin/

http://localhost:1865/public

The language model must be configured in Cheshire Cat.


In the Apple case, where the Ollama server is external to Docker, the address http://host.docker.internal:11434 must be used instead.
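The two cases can be made explicit with a small helper; a sketch (the function name is mine, and the addresses come from the post: ollama_cat is the container name used in the docker exec command above, while whether it is also the hostname on the compose network is an assumption):

```python
def ollama_base_url(native_host_app: bool) -> str:
    """Return the Ollama endpoint Cheshire Cat should use."""
    if native_host_app:
        # Ollama runs as a native app on the host (e.g. Mac M1):
        # from inside a container, host.docker.internal reaches the host.
        return "http://host.docker.internal:11434"
    # Ollama runs as a container on the same compose network (x86 case);
    # "ollama_cat" as hostname is an assumption based on the container name.
    return "http://ollama_cat:11434"
```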



Following this example https://cheshirecat.ai/local-embedder-with-fastembed/ the embedder can be configured to use the built-in one, so that everything runs locally.


To feed documents to the model via RAG it is enough to drag a PDF onto the interface and wait for it to be processed.




Thursday, January 2, 2025

LLama checkpoint download

Besides downloading the data needed to use a model, it may be necessary to retrain one; in that case the quantized models cannot be used, and the original checkpoint data are needed instead.


 

To do this, llama-stack is used.

A request must first be submitted at https://www.llama.com/llama-downloads/

and you receive an email containing a link.

llama-stack is then installed:

pip install llama-stack
llama model list

llama download --source meta --model-id meta-llama/Llama-3.2-3B 

At this point you are asked to paste the link received by email.

The checkpoint files end up in .llama:

luca@Dell:~$ cd .llama/
luca@Dell:~/.llama$ ls
checkpoints
luca@Dell:~/.llama$ cd checkpoints/
luca@Dell:~/.llama/checkpoints$ ls -la
total 12
drwxr-xr-x 3 luca luca 4096 Dec 31 06:24 .
drwxr-xr-x 3 luca luca 4096 Dec 30 17:30 ..
drwxr-xr-x 2 luca luca 4096 Dec 30 17:31 Llama3.2-3B
luca@Dell:~/.llama/checkpoints$ cd Llama3.2-3B/
luca@Dell:~/.llama/checkpoints/Llama3.2-3B$ ls
checklist.chk  consolidated.00.pth  params.json  tokenizer.model
luca@Dell:~/.llama/checkpoints/Llama3.2-3B$ ls -la
total 6277140
drwxr-xr-x 2 luca luca       4096 Dec 30 17:31 .
drwxr-xr-x 3 luca luca       4096 Dec 31 06:24 ..
-rw-r--r-- 1 luca luca        156 Dec 30 17:31 checklist.chk
-rw-r--r-- 1 luca luca 6425581594 Dec 30 17:36 consolidated.00.pth
-rw-r--r-- 1 luca luca        220 Dec 30 17:31 params.json
-rw-r--r-- 1 luca luca    2183982 Dec 30 17:31 tokenizer.model
luca@Dell:~/.llama/checkpoints/Llama3.2-3B$
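A quick sanity check that a download completed can be scripted; a sketch based on the file names listed above (the helper name is mine):

```python
import os

# File names observed in the checkpoint directory after a download.
EXPECTED = {"checklist.chk", "consolidated.00.pth", "params.json", "tokenizer.model"}

def missing_checkpoint_files(ckpt_dir: str) -> set:
    # Return the expected checkpoint files that are absent from ckpt_dir.
    present = set(os.listdir(ckpt_dir)) if os.path.isdir(ckpt_dir) else set()
    return EXPECTED - present

if __name__ == "__main__":
    path = os.path.expanduser("~/.llama/checkpoints/Llama3.2-3B")
    missing = missing_checkpoint_files(path)
    print("complete" if not missing else f"missing: {sorted(missing)}")
```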


Tuesday, September 17, 2024

RAG with Ollama, Mistral and LangChain

Another experiment, using this repository: https://github.com/CallumJMac/lessons

The reference folder is lessons/1. RAG/examples/pixegami/PDF_files_langchain/rag-tutorial-v2-main

The PDF files go into the data folder.

Then run populate_database.py; persistence is provided by ChromaDB:

import argparse
import os
import shutil
from langchain.document_loaders.pdf import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.schema.document import Document
from get_embedding_function import get_embedding_function
from langchain.vectorstores.chroma import Chroma


CHROMA_PATH = "chroma"
DATA_PATH = "data"


def main():

    # Check if the database should be cleared (using the --reset flag).
    parser = argparse.ArgumentParser()
    parser.add_argument("--reset", action="store_true", help="Reset the database.")
    args = parser.parse_args()
    if args.reset:
        print("✨ Clearing Database")
        clear_database()

    # Create (or update) the data store.
    documents = load_documents()
    chunks = split_documents(documents)
    add_to_chroma(chunks)


def load_documents():
    document_loader = PyPDFDirectoryLoader(DATA_PATH)
    return document_loader.load()


def split_documents(documents: list[Document]):
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=80,
        length_function=len,
        is_separator_regex=False,
    )
    return text_splitter.split_documents(documents)


def add_to_chroma(chunks: list[Document]):
    # Load the existing database.
    db = Chroma(
        persist_directory=CHROMA_PATH,
        embedding_function=get_embedding_function()
    )

    # Calculate Page IDs.
    chunks_with_ids = calculate_chunk_ids(chunks)

    # Add or Update the documents.
    existing_items = db.get(include=[])  # IDs are always included by default
    existing_ids = set(existing_items["ids"])
    print(f"Number of existing documents in DB: {len(existing_ids)}")

    # Only add documents that don't exist in the DB.
    new_chunks = []
    for chunk in chunks_with_ids:
        if chunk.metadata["id"] not in existing_ids:
            new_chunks.append(chunk)

    if len(new_chunks):
        print(f"👉 Adding new documents: {len(new_chunks)}")
        new_chunk_ids = [chunk.metadata["id"] for chunk in new_chunks]
        db.add_documents(new_chunks, ids=new_chunk_ids)
        db.persist()
    else:
        print("✅ No new documents to add")


def calculate_chunk_ids(chunks):

    # This will create IDs like "data/monopoly.pdf:6:2"
    # Page Source : Page Number : Chunk Index

    last_page_id = None
    current_chunk_index = 0

    for chunk in chunks:
        source = chunk.metadata.get("source")
        page = chunk.metadata.get("page")
        current_page_id = f"{source}:{page}"

        # If the page ID is the same as the last one, increment the index.
        if current_page_id == last_page_id:
            current_chunk_index += 1
        else:
            current_chunk_index = 0

        # Calculate the chunk ID.
        chunk_id = f"{current_page_id}:{current_chunk_index}"
        last_page_id = current_page_id

        # Add it to the page meta-data.
        chunk.metadata["id"] = chunk_id

    return chunks


def clear_database():
    if os.path.exists(CHROMA_PATH):
        shutil.rmtree(CHROMA_PATH)


if __name__ == "__main__":
    main()
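The splitter parameters above (chunk_size=800, chunk_overlap=80) mean that each chunk shares its first 80 characters with the tail of the previous one. A simplified character-window sketch of the idea (the real RecursiveCharacterTextSplitter additionally tries to break on separators such as newlines, so its chunks are not fixed-size):

```python
def sliding_chunks(text: str, chunk_size: int = 800, overlap: int = 80) -> list[str]:
    # Naive fixed-step version: advance by chunk_size - overlap each time,
    # so consecutive chunks overlap by `overlap` characters.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

The overlap gives the retriever a margin so that a sentence cut at a chunk boundary is still fully contained in the neighbouring chunk.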

The database can then be queried from the command line, for example:

python query_data.py "what's monopoly"


import argparse
from langchain.vectorstores.chroma import Chroma
from langchain.prompts import ChatPromptTemplate
from langchain_community.llms.ollama import Ollama

from get_embedding_function import get_embedding_function

CHROMA_PATH = "chroma"

PROMPT_TEMPLATE = """
Answer the question based only on the following context:

{context}

---

Answer the question based on the above context: {question}
"""


def main():
    # Create CLI.
    parser = argparse.ArgumentParser()
    parser.add_argument("query_text", type=str, help="The query text.")
    args = parser.parse_args()
    query_text = args.query_text
    query_rag(query_text)


def query_rag(query_text: str):
    # Prepare the DB.
    embedding_function = get_embedding_function()
    db = Chroma(persist_directory=CHROMA_PATH, embedding_function=embedding_function)

    # Search the DB.
    results = db.similarity_search_with_score(query_text, k=5)

    context_text = "\n\n---\n\n".join([doc.page_content for doc, _score in results])
    prompt_template = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
    prompt = prompt_template.format(context=context_text, question=query_text)
    # print(prompt)

    model = Ollama(model="mistral")
    response_text = model.invoke(prompt)

    sources = [doc.metadata.get("id", None) for doc, _score in results]
    formatted_response = f"Response: {response_text}\nSources: {sources}"
    print(formatted_response)
    return response_text


if __name__ == "__main__":
    main()
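The prompt assembly step of query_rag can be seen in isolation with plain string formatting (same template and separator as in query_data.py; the helper name and toy chunks are mine):

```python
PROMPT_TEMPLATE = """
Answer the question based only on the following context:

{context}

---

Answer the question based on the above context: {question}
"""

def build_prompt(docs: list[str], question: str) -> str:
    # Join the retrieved chunks with the same separator used in query_data.py,
    # then substitute context and question into the template.
    context = "\n\n---\n\n".join(docs)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

This is what Mistral actually receives: only the retrieved chunks, not the whole PDF.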


Just to give an idea, this is the response:

Response:  Monopoly is a property trading game from Parker Brothers designed for ages 8 and up, suitable for 2 to 8 players. The gameboard is used along with tokens, houses, hotels, Chance and Community Chest cards, Title Deed cards, play money, and a Banker's tray. Players can choose to play by the classic rules or use the Speed Die for faster gameplay. In Monopoly, the objective is to become the wealthiest player by buying, renting, and selling properties.


Using LLama3:7b the response was:

A simple one!

According to the context, Monopoly is a "Property Trading Game" from Parker Brothers.

 




