Link List :: 2025-02-10
https://github.com/FareedKhan-dev/train-llm-from-scratch
This is a thorough analysis of two large language models (LLMs), one with 13 million parameters and another with 1 billion parameters. The author provides an in-depth comparison of their strengths and weaknesses, highlighting both the benefits and limitations of each model.
Key Takeaways:
- Smaller model can be effective: The 13 million-parameter model is able to generate clear and accurate text, even in longer contexts, making it a viable option for goal-oriented tasks.
- Bigger model requires deeper architecture: The 1 billion-parameter model’s ability to handle complex contexts and generate coherent text relies on a more sophisticated architecture that requires careful consideration of its design and training data.
- Overfitting is a risk: If the bigger model is not designed with sufficient depth and complexity, it may overfit the training data and fail to improve performance compared to smaller models.
- Fine-tuning can improve performance: Fine-tuning the 1 billion-parameter model on domain-specific data, such as writing emails or essays, can help improve its ability to generate high-quality text.
Recommendations:
- Create a smaller, goal-oriented LLM: Build an LLM with 13 million parameters that can handle specific tasks, such as generating email responses or writing essays.
- Fine-tune the bigger model: Train the 1 billion-parameter model on domain-specific data to improve its ability to generate high-quality text and adapt to new contexts.
Implications:
- Larger models are not always better: While larger models can handle more complex tasks, they require more computational resources and may overfit the training data if not designed carefully.
- Deeper architectures are crucial: A more sophisticated architecture is necessary to support the vast amount of parameters in large LLMs, ensuring that they can learn and generalize effectively.
Overall, this analysis highlights the importance of considering the trade-offs between model size, complexity, and performance when building large language models. By understanding these factors, developers can create effective models that balance computational efficiency with high-quality text generation capabilities.
https://github.com/grapeot/devin.cursorrules
- This repository gives you everything needed to supercharge your Cursor/Windsurf IDE or GitHub Copilot with advanced agentic AI capabilities at a fraction of the cost
- Automated planning and self-evolution, so your AI “thinks before it acts” and learns from mistakes
- Extended tool usage, including web browsing, search engine queries, and LLM-driven text/image analysis
- [Experimental] Multi-agent collaboration, with one doing the planning, and regular Claude/GPT-4 doing the execution
- Easy Setup: using Cookiecutter (recommended) or manual setup by copying files into project root folder
- Planner-Executor Multi-Agent (Experimental): a high-level Planner and an Executor that coordinate complex tasks
- Extended Toolset: includes web scraping, search engine integration, and LLM-powered analysis
- Self-Evolution: AI updates its “lessons learned” in .cursorrules, accumulating project-specific knowledge over time
- Start exploring advanced tasks with fully agentic capabilities
- License: MIT
https://dzone.com/articles/powering-llms-with-apache-camel-and-langchain4j
This is a comprehensive tutorial on using Apache Camel and LangChain4j to integrate Large Language Models (LLMs) into integration flows. Here’s a summary of the key points:
Overview
The tutorial introduces the concept of integrating LLMs with Apache Camel, a popular open-source integration framework. The focus is on leveraging the LangChain4j component, which provides a way to integrate LLMs with Camel routes.
https://dzone.com/articles/fresh-data-ai-spring-ai-function-calls
This article reviews how to integrate Spring Boot’s AI features into an Angular application using Angular Material Tree component.
- The Spring Boot side creates an API that accepts input from the user and responds with the results in different formats (e.g., plain text, JSON).
- It uses Spring Boot’s
@PostMappingannotation to map incoming requests to methods that calculate or retrieve data. - The
postLibraryFunctionmethod is used to send a request to the API, specifying the input question and desired response format.
Angular Side
- The Angular Material Tree component is used to display the results in a structured way.
- The tree component’s
dataSource,hasChild, andtreeControlproperties are set up to work with the structured data from Spring Boot. - The
searchmethod fetches new data every 100 milliseconds until it receives a response from the backend, displaying a loading message. - When the response is received, the results are mapped into the tree’s data source using
addToDataSourceandmapResult. - The
hasChildfunction checks if each node in the tree has children that can be opened.
https://github.com/lightpanda-io/browser
-
Lightpanda is an open-source, headless web browser written in Zig 0.13.0
-
Supports JavaScript execution and partial Web APIs (work-in-progress)
-
Compatible with Playwright and Puppeteer through CDP (work-in-progress)
-
Ultra-low memory footprint (9x less than Chrome) and fast execution (11x faster than Chrome) & instant startup
-
Fast web automation for AI agents, LLM training, scraping and testing
-
Compatible with HTML parser and DOM tree (based on Netsurf libs), Ajax, XHR API, Fetch API, DOM dump, and basic CDP/websockets server
https://dzone.com/articles/call-graphs-code-exploration-tree-sitter
- Tree-sitter is a powerful and performant parser generator library implemented in C and optimized for cross-platforms.
- It supports grammar for most popular high-level programming languages.
- It also supports bindings for multiple languages to be integrated with any type of application.
https://github.com/jbellis/llmap
- Tools like Aider and Cursor are great at editing code for you once you give them the right context, but finding that context automatically is largely an unsolved problem in large codebases.
- LLMap is a CLI code search tool designed to solve this by asking DeepSeek-V3 and DeepSeek-R1 to evaluate the relevance of each source file in your codebase to your problem.
- Until recently, this would be prohibitively expensive and slow, but DeepSeek-V3 is cheap, smart, fast, and allows multiple concurrent requests.
- LLMap performs analysis 500 files at a time, making it reasonably fast even for large codebases.
- The tool uses caching from DeepSeek to speed up repeated searches against the same files.
- It optimizes the problem by using a multi-stage analysis:
- Coarse analysis using code skeletons (DeepSeek-V3)
- Full source analysis of potentially relevant files
- Refining the output to only the most relevant snippets (DeepSeek-R1)
- Currently, Java and Python files are supported through skeletonization, but other languages can be processed without skeletonization.
- Contributions are welcome to extend parsing to other languages.
- Install with
pip install llmap-aiand obtain a DeepSeek API key from platform.deepseek.com. - Example command:
find src/ -name "*.java" | llmap "Where is the database connection configured?" - LLM APIs can be rate-limited, so LLMap caches responses in
~/.cache/llmap. - Print output to stdout or save to a file for use with AI chat tools.
- Errors are logged to stderr.
- Options include:
--no-refineand--no-skeletonsto adjust the analysis- Commandline parameters:
--sample,--llm-concurrency - Environment variables:
LLMAP_CACHEandLLMAP_ANALYZE_MODEL
https://huggingface.co/blog/smolagents-can-see
This is a comprehensive guide on how to create a CodeAgent that can interact with the web using a vision-enabled model, specifically the Qwen2VL via Fireworks API. The agent uses the Helium framework to interact with the web and capture screenshots.
Here’s a high-level overview of the steps involved:
- Setting up the environment: The guide sets up the necessary environment by installing Helium and smolagents.
- Defining the model: A Qwen2VL via Fireworks API model is defined using the OpenAIServerModel class from smolagents.
- Defining the agent: A CodeAgent is created with a model, tools (e.g.,
go_back,close_popups,search_item_ctrl_f), step callbacks (e.g.,save_screenshot), and verbosity levels set to 2 for full output messages. - Providing guidance: Helium instructions are provided to help the agent navigate the web and use its tools effectively.
- Running the agent: The agent is run with a sample task that requires it to navigate to a GitHub trending page, find the top author’s profile, and retrieve their total number of commits over the last year.
The guide also includes some general tips and best practices for using the CodeAgent, such as:
- Using
.exists()to check if an element exists - Not trying to solve tasks in one shot, but rather proceeding in several steps
- Avoiding code-based element searches like
find_all(S("ol > li"))and instead relying on visual inspection of the latest screenshot - Being cautious when using
time.sleep()to wait for pages to load
Overall, this guide provides a comprehensive introduction to creating a CodeAgent that can interact with the web using a vision-enabled model.
https://github.com/reflex-dev/reflex-llm-examples
- A curated repository of AI Apps built with Reflex
- Showcasing practical use cases of Large Language Models (LLMs)
- From providers such as Google, Anthropic, Open AI, and self-hosted open-source models
- Highlights:
- AI agents and their use cases
- RAG (Retrieval-Augmented Generation) implementations
- Best practices for building scalable AI-powered solutions
https://github.com/atyrode/gitemplate
- 🤖 Optimized for LLM integration and AI workflows
- 🚀 FastAPI for high-performance API development
- 🎨 Jinja2 templating engine
- 💅 TailwindCSS for utility-first styling
- ⚡️ Deploy your app in just an hour
- Production-Ready Structure: Organized project layout following best practices
- Modern Stack:
- FastAPI for high-performance APIs
- Jinja2 templating engine for the front-end
- TailwindCSS for modern styling
- Security:
- Built-in rate limiting
- Trusted host middleware
- Security headers configuration
- CORS configuration
- Environment variables management
- Deployment Ready:
- Docker support with multi-stage builds
- GitHub Actions workflows for:
- CI/CD pipeline
- PyPI publishing
- Docker image building
- Health check endpoints
- Production-grade logging
- Developer Experience:
- Pre-configured development tools:
- Black for code formatting
- isort for import sorting
- pylint for code analysis
- mypy for type checking
- pytest for testing
- Pre-commit hooks for code quality
- Type hints and comprehensive docstrings
- Pre-configured development tools:
- Hot reload during development
- Template System:
- Easy customization through
template.py - Flexible project structure
- Configurable dependencies
- Easy customization through
- Use this template by clicking “Use this template” on GitHub or clone it:
git clone https://github.com/atyrode/gitemplate.git - Create and activate a virtual environment:
python -m venv venvsource venv/bin/activate # On Windows:venv\Scripts\activate`` - Set up your project details in
template.py:- author: "Your Name"- package_name: "your_package"- project_name: "Your Project" - Apply the template:
python template.py - Run the development server:
cd srcpython -m uvicorn server.main:app --reload --host 0.0.0.0 --port 8000Visithttp://localhost:8000to see your application running! - pytest
- Contributions are welcome! Please read our Contributing Guidelines for details on how to submit pull requests, report issues, and contribute to the project.
https://timkellogg.me/blog/2025/01/25/r1
This article discusses the rapid progress in artificial intelligence (AI) research and its implications on geopolitics and software regulation. Here’s a summary of the key points:
Recent Breakthroughs
- R1, an AI model developed by DeepSeek, has achieved significant improvements in reasoning and inference speed, making it one of the most accurate models available.
- The article suggests that OpenAI’s GPT-4o is likely a distillation of R1, but the exact relationship between the two models is unclear.
Scaling Laws
The article identifies several scaling laws in AI research:
- Pre-training: Pre-training has become increasingly difficult, but not impossible.
- Inference scaling: Models are becoming smaller and faster, making them more efficient at inference time.
- Downsizing models: Smaller models are being developed that can produce similar results to larger models.
- Reinforcement learning (RL) scaling laws: RL is emerging as a key factor in improving model performance.
Model Distillation
The article discusses the concept of model distillation, where a student model is trained using training data from a teacher model. This process can help improve model performance and efficiency.
Geopolitics: Distealing
The author introduces the term “distealing,” referring to unauthorized distillation of AI models. The article notes that there are tensions between China and the USA over AI development, with China pursuing cheaper solutions while the USA invests heavily in AI research.
Strategies
The article outlines three strategies:
- USA: Invest heavily in AI research to stay ahead.
- China: Focus on developing cheaper AI solutions using smart engineers and researchers.
- Europe: Regulate or open-source AI to ensure accountability and fairness.
Conclusion
The article concludes that the rapid progress in AI research is accelerating rapidly, and the future of AI is more clear than ever before. The main takeaway is that R1 provides clarity where OpenAI was previously opaque, making it a significant milestone in the development of AI.
https://github.com/yoheinakajima/mindgraph
- MindGraph is a proof-of-concept starter kit that demonstrates the capabilities of a graph database and integration framework.
- It includes a Python module that provides a flexible and modular way to integrate with different databases.
https://github.com/yoheinakajima/instagraph
-
InstaGraph: A Flask Application for Converting Text or URLs into Insightful Knowledge Graphs
- Powered by OpenAI’s GPT-3.5, this application generates vividly colored graphs to visualize relationships between entities.
- Features:
- Dynamic Text to Graph conversion
- Color-coded graph nodes and edges
- Responsive design
- Super-duper user-friendly
https://www.agentrecipes.com/
- Explore common agent recipes with ready-to-copy code for improved LLM applications, largely inspired by Anthropic’s article.
- Prompt chaining: a sequential design allowing structured reasoning and step-by-step task completion through the output of one LLM call becoming input for the next.
- Routing: a low-latency workflow where inputs are dynamically routed to the most appropriate LLM instance or configuration for optimized efficiency and specialization.
- Parallelization: a workflow that distributes tasks across multiple LLM calls simultaneously, aggregating results for efficient handling of complex or large-scale operations.
- Orchestrator-workers: a workflow with a central orchestrator directing multiple worker LLMs to perform subtasks, synthesizing their outputs for complex, coordinated operations.
- Evaluator-optimizer: a feedback loop workflow where LLM-generated outputs are evaluated, refined, and optimized iteratively to improve accuracy and relevance.
- Autonomous Agent: an agent-based workflow where LLMs act autonomously within a loop, interacting with their environment and receiving feedback to refine their actions and decisions.
https://github.com/urcades/ollama-interface
- What is this?: A super simple html + css + js ollama chat interface for hacking on.
- Why?: Most of the ollama chat interfaces out there were quite “heavy” and involved use of the javascript and python package ecosystem, or alternatively, docker.
- How do I get started?:
- Clone this repo
- Run python -m http.server 8888 (or similar) to get started prompting
- http://localhost:8888 should show the chat interface.
https://newsletter.languagemodels.co/p/the-illustrated-deepseek-r1
This is an explanation of the DeepSeek-R1 model, a large language model that excels in reasoning and mathematical problem-solving. Here’s a breakdown of the key points:
https://old.reddit.com/r/Python/comments/1ibbbib/multicharting_live_streaming_tool_for_ibkr/
- It’s a Python-based trading/charting tool dashboard created after 4 years of development.
- Features:
- Live data and candlestick charting with update intervals
- Multi-charting up to 6 charts per screen (multiple tabs)
- Bloomberg news stream on the home page
- Ticker search functionality for IBKR offerings
- Indicators in Typescript, customizable
- Currently only supports IBKR data feeds
https://github.com/yashgoenka/chat-apple-notes
- Demo: Chat Apple Notes in action (video)
- Vector Store Integration:
- Extracts Apple Notes via AppleScript
- Creates embeddings using OpenAI’s vector store
- Semantic Search:
- Queries note embeddings for contextually relevant matches
- RAG-based Queries:
- Leverages note embeddings for context-aware question answering
- Conversational Interface:
- Maintains chat context through OpenAI’s thread management
https://block.github.io/goose/
- Your on-machine AI agent automating engineering tasks seamlessly.
- Open Source: Built with transparency and collaboration in mind, empowering developers to contribute, customize, and innovate freely.
- Runs Locally: Executes tasks efficiently, keeping control in your hands.
- Extensible: Customize goose with your preferred LLM and enhance its capabilities by connecting it to any external MCP server or API.
- Autonomous: Goose independently handles complex tasks, freeing you to focus on what matters most.
https://www.interconnects.ai/p/why-reasoning-models-will-generalize
The post discusses the emergence of “reasoning” models in natural language processing (NLP) and their potential impact on the field. Here’s a summary of the key points:
https://github.com/Upsonic/Upsonic
- Upsonic offers a cutting-edge enterprise-ready framework for orchestrating LLM calls, agents, and computer use to complete tasks cost-effectively.
- Key features:
- Production-Ready Scalability: deploy on AWS, GCP, or locally using Docker.
- Task-Centric Design: focus on practical task execution with options for basic tasks via LLM calls, advanced tasks with V1 agents, complex automation using V2 agents with MCP integration, and more.
- MCP Server Support: utilize multi-client processing for high-performance tasks.
- Tool-Calling Server: exception-secure tool management with robust server API interactions.
- Computer Use Integration: execute human-like tasks using Anthropic’s ‘Computer Use’ capabilities.
- Easily add custom tools with a single line of code.
- Client-server architecture: production-ready stateless enterprise-ready system
https://github.com/codegen-sh/codegen-sdk
- Codegen is a python library for manipulating codebases.
- It builds a complete graph connecting functions, classes, imports, and their relationships.
- It provides comprehensive static analysis for references, dependencies, etc.
- It auto-handles references and imports to maintain correctness.
- It supports multi-language code manipulation at scale using Tree-sitter and rustworkx algorithms.
- Supported languages: Python, TypeScript, Javascript, React
- Supported platforms: macOS, Linux (with Apple Silicon and x86_64/aarch64), Windows not supported
- Installation: pip install codegen, tool install codegen for global CLI
- Features:
- Create codemods with
codegen initandcodegen create - Run codemods with
codegen run - Open isolated Jupyter notebooks with
codegen notebook
- Create codemods with
- Built on real-world refactors performed on enterprise codebases.
https://github.com/browser-use/macOS-use
- Enables AI agents to interact with MacBook via macOS-use
- Install with
pip install mlx-useand clone repository from https://github.com/browser-use/macOS-use - Requires API key from OAI, Anthropic or Gemini providers (with limitations for Gemini)
- Supports uv environment with
brew install uv && uv venv - Example usage:
- Calculate result of “5 X 4”
- Login to Auth0 using Google account
- Check current hour in Israel
- Project aims to build AI agent for MLX framework, aiming for open source and SOTA reliability
- Future features include:
- Refine prompting and self-correction
- Release first working version to PyPI
- Improve efficiency and cost
- Support iPhone/iPad devices
- Fine-tune small models for local inference
https://github.com/Om-Alve/smolGPT
Minimal Codebase
- Pure PyTorch implementation with no abstraction overhead
Modern Architecture
- GPT model with:
- Flash Attention (when available)
- RMSNorm and SwiGLU
- Efficient top-k/p/min-p sampling
- Rotary embeddings - RoPE (Optional)
Training Features
- Mixed precision (bfloat16/float16)
- Gradient accumulation
- Learning rate decay with warmup
- Weight decay & gradient clipping
Dataset Support
- Built-in TinyStories dataset processing
Custom Tokenizer
- SentencePiece tokenizer training integration
https://www.philschmid.de/mini-deepseek-r1
This is a detailed report on a research experiment using Reinforcement Learning (RL) with the DeepSeek R1 methodology, specifically targeting the Countdown Game task. The experiment uses the GRPO (Generalized Reward Proximal Optimize) algorithm and two rule-based rewards to train a 3B parameter model on the game. Here’s a summary of the report:
Methodology
- The experiment uses the GRPO algorithm with two rule-based rewards: one for the correct format
<think>...</think>\n<answer>...</answer>and another for successful equation solving. - The reward functions are not explicitly defined, instead relying on the model to learn through trial and error.
- The model is trained using a 3B parameter base model (Qwen 2.5 3B).
- The experiment uses 4 H100 GPUs in distributed training mode, with a total of 6 hours of compute time.
Results
- At around 50 steps, the model learns to recognize and use the correct format
<think>...</think>\n<answer>...</answer>. - By step 100, the success rate for solving equations reaches around 25%.
- By step 200, the performance converges, with a success rate of around 40%. The model starts to “reason” using words.
- At step 450, the success rate has increased to around 50%, but the performance still improves slowly. The model adopts a new format, similar to programmatic execution.
https://github.com/plexe-ai/smolmodels
- Build machine learning models using natural language and minimal code
- Create machine learning models by describing what you want them to do in plain words
- The library builds a model for you, including data generation, feature engineering, training, and packaging
- Note: This library is in early development, report bugs or share feature requests on GitHub or Discord
- Installation:
- pip install smolmodels
https://multiagentbook.com/labs/usecases/
- The guide provides a comprehensive guide to building, evaluating, and deploying multi-agent systems
- Explore real-world examples of multi-agent systems across different frameworks and domains
https://github.com/rohitg00/manim-video-generator
- A web-based tool for generating mathematical animations using Manim, Flask, and OpenAI.
- Generate mathematical animations from text descriptions.
- Modern, responsive web interface.
- Real-time code preview with syntax highlighting.
- Support for various mathematical concepts (Basic geometry and algebra, Calculus concepts, 3D visualizations, Matrix operations, Complex numbers, Differential equations).
- Easy-to-use example prompts.
- Docker support for easy deployment.
https://github.com/Razikus/supabase-nextjs-template
- A production-ready SaaS template built with Next.js 15, Supabase, and Tailwind CSS, providing authentication, user management, file storage, and more.
- Demo: https://basicsass.razikus.com
- Video: https://www.youtube.com/watch?v=kzbXavLndmE
- SupaNuggets series - 50 mini apps available for free at https://supanuggets.razikus.com (Pay As You Want model)
- Authentication:
- Email/Password authentication
- Multi-factor authentication (MFA) support
- OAuth/SSO integration ready
- Password reset and email verification
- User Management:
- User profiles and settings
- Secure password management
- Session handling
- File Management Demo (2FA ready):
- Secure file upload and storage
- File sharing capabilities
- Drag-and-drop interface
- Progress tracking
- Task Management Demo (2FA ready):
- CRUD operations example
- Real-time updates
- Filtering and sorting
- Row-level security
- Security:
- Row Level Security (RLS) policies
- Secure file storage policies
- Protected API routes
- MFA implementation
- UI/UX:
- Modern, responsive design
- Dark mode support
- Loading states
- Error handling
- Toast notifications
- Confetti animations
- Legal & Compliance:
- Privacy Policy template
- Terms of Service template
- Refund Policy template
- GDPR-ready cookie consent
- Frontend:
- Next.js 15 (App Router)
- React 19
- Tailwind CSS
- shadcn/ui components
- Lucide icons
- Backend:
- Supabase
- PostgreSQL
- Row Level Security
- Storage Buckets
- Authentication with Supabase Auth:
- MFA support
- OAuth providers
- Fork or clone repository
- Prepare Supabase Project URL and database password
- Contributing:
- Contributions welcome!
- Paid template available at https://sasstemplate.razikus.com (with Paddle + organisations API keys + multiple organisations + Role Based Access Control)
- 50% off for code GitHub purchase: https://razikus.gumroad.com/l/supatemplate/GITHUB
- Licensing:
- Apache License - see LICENSE file for details
https://github.com/jamespilgrim/PiTrac
- Introducing PiTrac - the world’s first (free!) open-source golf launch monitor, built with Raspberry Pi computers and cameras.
- PiTrac uses off-the-shelf hardware, including a parts list with links to potential suppliers.
- The full design is being released as open source on GitHub for folks to build themselves.
- PiTrac interfaces with GSPro and E6/TruGolf simulators, and its output is also accessible on a stand-alone web-based app.
- Two Raspberry Pi computers and cameras are the most expensive parts, costing around $250 in total.
- The project is not commercial and is seeking community developers to help test and continue development.
- Improving documentation and design to make it easier for others to build their own PiTracs.
- A GitHub repository with 3D printed part designs, hardware design, and initial software documentation.
- Support page for continued work and release process.
- PiTrac Discord server for community discussion, help, and show-off your build.
- Initial ecosystem is still developing, but the Discord is a way to clear up any fuzzy parts.
- Building your own PiTrac DIY Launch Monitor requires soldering skills, 3D printing knowledge, and Linux expertise.
https://github.com/pydantic/pydantic.run
- Python browser sandbox based on Pyodide, allowing write and share Python code execution in the browser
- Built with Pydantic, PydanticAI, and Pydantic Logfire
- Dependencies are installed when code is run or inferred from imports
- Dependencies can be defined explicitly in a comment at the top of the file (PEP 723)
- Example:
# /// scriptfollowed bydependencies = ["pydantic", "email-validator"] - Version pinning for non-binary packages allowed (e.g.
rich<13) - Programmatically create a sandbox by sending a GET request to https://pydantic.run/new
- Request parameters:
- files: an array of objects with name, content, and optionally activeIndex and tab properties
- activeIndex: specifies the file/tab open by default (highest value wins)
- tab: selects which tab is open by default
- Sandbox creation URL generation example provided in minimal HTML page
https://www.reddit.com/r/Python/comments/1ieq2sn/my_first_python_code_on_nfl_data_visualization/
- My First Python code on NFL Data Visualization
- Uses football player tracking data collected through the NFL Big Data Bowl to create interactive visualizations
- Combines programming and sports interests for real-time data analysis and visualization
- Features:
- Play visualizations on the field in real-time
- Interactive statistics analysis of plays and key player stats
- Team performance insights into strategies based on game data
- Technologies and tools used:
- Python
- Pandas and NumPy
- Matplotlib and Seaborn
- Plotly
- Jupyter Notebooks
https://github.com/maxbbraun/trump2cash
- This bot watches Donald Trump’s tweets and waits for him to mention any publicly traded companies.
- It uses sentiment analysis to determine whether his opinions are positive or negative toward those companies.
- The bot then automatically executes trades on the relevant stocks according to the expected market reaction.
- Tweets out a summary of its findings in real time at @Trump2Cash.
https://github.com/TheBlewish/Automated-AI-Web-Researcher-Ollama
- Automated-AI-Web-Researcher is an innovative research assistant that leverages locally run large language models through Ollama to conduct thorough, automated online research on any given topic or question.
- Unlike traditional LLM interactions, this tool performs structured research by breaking down queries into focused research areas, systematically investigating each area via web searching and scraping relevant websites, and compiling its findings.
- The findings are automatically saved into a text document with all the content found and links to the sources.
- Whenever you want it to stop its research, you can input a command, which will terminate the research.
- The LLM will then review all of the content it found and provide a comprehensive final summary of your original topic or question.
- Afterward, you can ask the LLM questions about its research findings.
- Key features:
- Automated research planning with prioritized focus areas
- Systematic web searching and content analysis
- All research content and source URLs saved into a detailed text document
- Research summary generation
- Post-research Q&A capability about findings
- Self-improving search mechanism
- Rich console output with status indicators
- Comprehensive answer synthesis using web-sourced information
- Research conversation mode
- Ollama is used as the local LLM runtime
- DuckDuckGo’s search API is integrated for searching
- Recommended models: phi3:3.8b-mini-128k-instruct, orphi3:14b-medium-128k-instruct (with custom context length)
- This project is licensed under the MIT License
- Contributions are welcome!
- The tool represents an attempt to bridge the gap between simple LLM interactions and genuine research capabilities.
https://huggingface.co/blog/open-deep-research
This is an open-source project aimed at building an agentic framework for humans to interact with artificial intelligence (AI) systems. Here’s a summary of the project:
Goals:
- Build an open-source agentic framework that allows humans to interact with AI systems in a more intuitive way.
- Leverage the power of open research to build a great open-source agentic framework.
Key Components:
- Web Browser: A simple text-based web browser is used as a starting point for our proof-of-concept.
- Text Inspector: A tool to read and inspect text files, taken from Microsoft Research’s Magentic-One agent.
- Code-Native Agent: Agents write their actions in code instead of JSON, which leads to improved performance.
https://simonwillison.net/2025/Feb/5/s1-the-6-r1-competitor/
- The $6 R1 Competitor? Tim Kellogg shares his notes on a new paper, s1: Simple test-time scaling, which describes an inference-scaling model fine-tuned on top of Qwen2.5-32B-Instruct for just $6.
- After sifting their dataset of 56K examples down to just the best 1K, they found that the core 1K is all that’s needed to achieve o1-preview performance on a 32B model.
- The paper describes a technique called “Budget forcing”:
- To enforce a minimum, suppress the generation of the end-of-thinking token delimiter
- Optionally append the string “Wait” to the model’s current reasoning trace to encourage reflection
- This is the same trick previously described by Theia Vogel
- A 32B model on Hugging Face can be run using Ollama like so: ollama run hf.co/brittlewis12/s1-32B-GGUF:Q4_0
- 1,000 samples are available in the simplescaling/s1K data repository on Hugging Face
- CSV file conversion and manipulation was done using DuckDB and sqlite-utils
- Recent articles:
- Using pip to install a Large Language Model that’s under 100MB - 7th February 2025
- OpenAI o3-mini, now available in LLM - 31st January 2025
- A selfish personal argument for releasing code as Open Source - 24th January 2025
https://github.com/n8n-io/self-hosted-ai-starter-kit
- An open-source Docker Compose template for initializing a local AI and low-code development environment
- Curated by n8n-io, combining self-hosted n8n platform with compatible AI products and components
- Includes:
- Self-hosted n8n - Low-code platform with over 400 integrations and advanced AI components
- Ollama - Cross-platform LLM platform for local LLMs
- Qdrant - Open-source vector store with comprehensive API
- PostgreSQL - Workhorse of the Data Engineering world, handling large amounts of data safely
- Pre-configured Docker Compose file with network and storage settings
https://github.com/suvakov/chargraph
- Let’s try a small experiment with LLMs: feed an entire book into the context window and ask it to generate a list of characters, their relationships, and physical descriptions—data that can later be used for image generation.
- Script chargraph.py is used to extract characters and relationships.
- Supports Gemini and OpenRouter API
- Character images were generated using portrait_prompt from JSON.
- Model: Gemini 2.0 Flash Exp
- Specs: 1M token context window, 8K token output limit
- Results are visualized in HTML/JS using D3
- Uses text files downloaded from Project Gutenberg
- Small books (Tom Sawyer and Peter Pan) are processed surprisingly well.
- Iterative approach helps refine results and adds some missing links and characters.
- 8K token output limit is a bottleneck, making it challenging to process large books like Les Misérables.
- Multiple copies of a book don’t help with results when possible to fit in the prompt.
- Improve prompt
- Test other large context window models.
- Find ‘ground truth’ character networks using more sophisticated analysis and use it as benchmark for large context models.
- Try it on legal documents (affidavits, indictments), historical documents, and movie/TV show scripts.
https://www.superagent.sh/blog/reag-reasoning-augmented-generation
- ReAG (Reasoning-Augmented Generation): A method that skips traditional retrieval steps and directly feeds raw materials to language models for synthesis of answers in one go.
- Traditional RAG systems have three core issues:
- Semantic search isn’t smart enough.
- Infrastructure complexity.
- Static knowledge.
- ReAG operates on a simple idea: Let the language model do the heavy lifting by analyzing raw documents and asking two questions: Is this document useful for the task? What specific parts of it matter?
- This approach mirrors how humans research: Skim sources, discard irrelevant ones, and focus on passages that address our specific question.
- ReAG is like a scholar, reading every book in full, underlining relevant paragraphs, and synthesizing insights based on the query’s deeper intent.
- Key design choices:
- Document-Level Analysis: The LLM processes entire documents, preserving cross-paragraph context.
- Dynamic Prompts: System instructions focus the model’s reasoning.
- ReAG’s strengths become clear in scenarios where context matters more than speed:
- Complex, open-ended queries
- Dynamic data
- Multimodal data
- Cheaper, faster language models and larger context windows will improve ReAG’s viability.
- Hybrid systems may emerge to balance speed and accuracy.
- Conclusion: ReAG is about rethinking how language models interact with knowledge, aligning better with human analysis methods.
https://github.com/neoneye/PlanExe
- PlanExe is a planning AI. You input a vague description of what you want and PlanExe outputs a plan.