Open Source Projects
browser-use
The browser-use repository provides an easy way to connect AI agents with the browser. It offers features like vision and HTML extraction, multi-tab management, custom actions, and parallelization of agents. It also provides configuration options for the browser and collects anonymous usage data for improvement. Contributions are welcome.
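Running agents in parallel usually amounts to scheduling several agent runs concurrently. Here is a minimal sketch with Python's asyncio, assuming each agent exposes an awaitable entry point. `run_agent` is a hypothetical stand-in and not part of the browser-use API.

```python
import asyncio


async def run_agent(task: str) -> str:
    """Hypothetical stand-in for one agent run; a real agent would drive a browser here."""
    await asyncio.sleep(0.01)  # simulate I/O-bound browser work
    return f"done: {task}"


async def run_all(tasks: list[str]) -> list[str]:
    # gather schedules every coroutine concurrently and returns results in input order
    return await asyncio.gather(*(run_agent(t) for t in tasks))


if __name__ == "__main__":
    print(asyncio.run(run_all(["check prices", "read docs"])))
```

Because the work is I/O-bound, a single event loop is typically enough; separate processes would only matter for CPU-heavy steps.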
Article Generation Tool
A Node.js application that generates articles from a given keyword by querying a search engine, summarizing the results, and drafting the text with AI. It can run from the command line or as an HTTP server, and it can publish the result as a WordPress draft or save it as a Markdown file.
MarkItDown
MarkItDown is a utility for converting various files to Markdown. It supports multiple file formats, including PDF, PowerPoint, Word, Excel, images, audio, HTML, text-based formats, and ZIP files. It can be installed via pip or from source, and can be used from the command line, through the Python API, or in Docker. The project welcomes contributions and has detailed instructions for running tests and checks.
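As a sketch of the Python API route, the helper below wraps a converter object and returns the Markdown text. It assumes the `markitdown` package is installed and that `MarkItDown().convert(path)` returns a result with a `text_content` attribute, as shown in the project's README. The `to_markdown` helper and its `converter` parameter are illustrative names, not part of the library.

```python
def to_markdown(path, converter=None):
    """Convert the file at `path` to Markdown text.

    `converter` is any object with a `convert(path)` method whose result has
    a `text_content` attribute; it defaults to a MarkItDown instance.
    """
    if converter is None:
        # Imported lazily so the helper can be exercised without the package
        from markitdown import MarkItDown

        converter = MarkItDown()
    return converter.convert(path).text_content


# Example (requires `pip install markitdown` and an existing file):
# print(to_markdown("report.pdf"))
```

Passing the converter in as a parameter keeps one instance reusable across many files and makes the helper easy to test with a fake.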
IdentityRAG
IdentityRAG is a retrieval-augmented generation system that integrates identity resolution capabilities to provide accurate, context-aware responses about specific customers. It unifies data from various sources, searches for and retrieves the relevant customer records, and consolidates and deduplicates them.
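The consolidate-and-deduplicate step can be illustrated with a small sketch. It assumes records are plain dictionaries and that a normalized email address is enough to decide that two records describe the same customer; a real identity resolution system would use fuzzier matching across many fields. All names here are hypothetical and unrelated to IdentityRAG's own code.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Lowercase and trim so trivially different spellings compare equal."""
    return email.strip().lower()


def consolidate(records: list[dict]) -> list[dict]:
    """Merge records that share a normalized email; the first non-empty value per field wins."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        groups[normalize_email(record["email"])].append(record)

    merged = []
    for email, group in groups.items():
        unified = {"email": email, "sources": []}
        for record in group:
            # Remember which system each contributing record came from
            unified["sources"].append(record.get("source", "unknown"))
            for field, value in record.items():
                if field in ("email", "source"):
                    continue
                if value and not unified.get(field):
                    unified[field] = value
        merged.append(unified)
    return merged


if __name__ == "__main__":
    crm = {"email": "Ana@Example.com ", "name": "Ana Silva", "phone": "", "source": "crm"}
    billing = {"email": "ana@example.com", "name": "", "phone": "555-0100", "source": "billing"}
    print(consolidate([crm, billing]))
```

The unified record is the kind of context a retrieval step could hand to the language model in place of several partial, conflicting rows.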
Animated Drawings
This repository contains an implementation of an algorithm for animating children's drawings. It provides tools for creating animations, exporting videos and GIFs, animating users' own drawings, fixing bad predictions, adding multiple characters and backgrounds, using different BVH files and skeletons, and creating custom config files. It also includes a browser-based demo, a citation to an accompanying paper, an amateur drawings dataset, and trained model weights.
Open RealtimeAPI Embedded SDK
An embedded SDK that has been tested on esp32s3 and Linux and can be built and used on both platforms. The build process covers setting the target, configuring device settings, and building for either esp32s3 or Linux.
Gemini API Cookbook
This is a collection of guides and examples for the Gemini API, including quickstart tutorials for writing prompts and using different features of the API, and examples of things you can build.
Genesis
Genesis is a physics platform designed for general-purpose Robotics/Embodied AI/Physical AI applications. It combines several things at once: a universal physics engine rebuilt from the ground up, capable of simulating a wide range of materials and physical phenomena; a lightweight, ultra-fast, Pythonic, and user-friendly robotics simulation platform; a powerful and fast photo-realistic rendering system; and a generative data engine that transforms user-prompted natural language descriptions into various modalities of data.