browser-use

The browser-use repository connects AI agents to the browser. It offers vision and HTML extraction, multi-tab management, custom actions, and parallel agents. It also exposes browser configuration options and collects anonymous usage data to guide improvements. Contributions are welcome.

Unknown AI agents browser automation Python Updated: 2024-12-23

Article Generation Tool

A Node.js application that generates articles from a given keyword by querying a search engine, summarizing the results, and passing them to an AI model. It can run from the command line or as an HTTP server, and it can publish the article as a WordPress draft or save it as a Markdown file.

Unknown article generation Node.js WordPress Updated: 2024-12-23

MarkItDown

MarkItDown is a utility for converting various files to Markdown. It supports multiple file formats including PDF, PowerPoint, Word, Excel, images, audio, HTML, text-based formats, and ZIP files. It can be installed via pip or from source, and can be used through command-line, Python API, or Docker. The project welcomes contributions and has detailed instructions for running tests and checks.

Unknown File conversion Markdown conversion Python API Updated: 2024-12-23

IdentityRAG

IdentityRAG is a retrieval-augmented generation system that integrates identity resolution capabilities to provide accurate, context-aware responses about specific customers. It unifies data from various sources, searches and retrieves relevant customer data, and consolidates and deduplicates it.

Unknown Customer Data Integration Customer Insights Chatbot Retrieval-Augmented Generation Updated: 2024-12-23
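
The consolidate-and-deduplicate step described above can be illustrated with a minimal sketch. This is not IdentityRAG's implementation: the record fields, the helper names (normalize_email, consolidate, build_context), and the exact-match-on-email rule are all assumptions made for illustration. Real identity resolution usually relies on fuzzy matching across several attributes.

```python
from collections import defaultdict


def normalize_email(email):
    """Hypothetical matching key: trimmed, lower-cased email address."""
    return email.strip().lower()


def consolidate(records):
    """Group records by normalized email and merge each group into one profile."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_email(rec["email"])].append(rec)

    profiles = []
    for email, recs in groups.items():
        profile = {"email": email, "sources": sorted({r["source"] for r in recs})}
        for rec in recs:
            for key, value in rec.items():
                if key in ("email", "source"):
                    continue
                # Keep the first non-empty value seen for each field.
                if value and key not in profile:
                    profile[key] = value
        profiles.append(profile)
    return profiles


def build_context(profiles, query):
    """Retrieve profiles whose name contains the query and render them as prompt context."""
    q = query.lower()
    hits = [p for p in profiles if q in p.get("name", "").lower()]
    return "\n".join(
        f"{p.get('name', '')} <{p['email']}> (sources: {', '.join(p['sources'])})"
        for p in hits
    )


# Made-up sample data: the first two records describe the same customer.
records = [
    {"source": "crm", "email": "Ada@Example.com", "name": "Ada Lovelace", "phone": ""},
    {"source": "billing", "email": "ada@example.com ", "name": "Ada Lovelace", "phone": "555-0100"},
    {"source": "support", "email": "grace@example.com", "name": "Grace Hopper"},
]

profiles = consolidate(records)
print(build_context(profiles, "ada"))
```

In a retrieval-augmented setup, the string returned by build_context would be placed in the model prompt, so the model sees one merged profile per customer rather than several conflicting records.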

Animated Drawings

This repository contains an implementation of an algorithm for animating children's drawings. It provides tools for creating animations, exporting videos and GIFs, animating your own drawings, fixing bad predictions, adding multiple characters and backgrounds, using different BVH files and skeletons, and creating custom config files. It also includes a browser-based demo, a citation for the accompanying paper, a dataset of amateur drawings, and trained model weights.

Unknown animation children's drawings pose estimation Updated: 2024-12-23

Open RealtimeAPI Embedded SDK

An embedded SDK that has been tested on the ESP32-S3 and Linux and can be built and used on both. The build process involves setting the target, configuring device settings, and building for either the ESP32-S3 or Linux.

Unknown embedded SDK ESP32-S3 linux Updated: 2024-12-23

Gemini API Cookbook

This is a collection of guides and examples for the Gemini API, including quickstart tutorials for writing prompts and using different features of the API, and examples of things you can build.

Python Gemini Large Language Models LLM Agent Updated: 2024-12-20 Google

Genesis

Genesis is a physics platform designed for general-purpose Robotics/Embodied AI/Physical AI applications. It combines four things: a universal physics engine rebuilt from the ground up, capable of simulating a wide range of materials and physical phenomena; a lightweight, ultra-fast, Pythonic, and user-friendly robotics simulation platform; a powerful and fast photorealistic rendering system; and a generative data engine that transforms user-prompted natural-language descriptions into various modalities of data.

Python & Physical AI Embodied AI Robotics Updated: 2024-12-20