automatic-video-processing
yolov5
| | automatic-video-processing | yolov5 |
|---|---|---|
| Mentions | 23 | 129 |
| Stars | 72 | 46,921 |
| Growth | - | 3.3% |
| Activity | 5.0 | 8.8 |
| Latest commit | about 2 years ago | 8 days ago |
| Language | Python | Python |
| License | - | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
automatic-video-processing
-
Building an API + query language for rich data like images and video
I'm one of the creators of Sieve, and I'm looking for general thoughts on this problem.
- I built the easiest way to process and tag videos with AI
-
The YC Winter 2022 Batch
https://sievedata.com seems very promising: a search engine for videos with specific tags sounds like a very good idea.
I'd like the same for all my photos and videos: it would be so much easier to find specific pictures by keyword.
-
Locally vs cloud stored management systems
The reason I ask is because I'm working on something called Sieve. We're basically making it really easy for any software developer to process and understand video content. This includes applications from home security, to pet monitoring, baby monitoring, sports analytics, and media understanding.
-
AI video understanding in games
Hey everyone! I'm the creator of Sieve, an API for AI-based video understanding. One application we're starting to support in beta is tracking player / object movements, speed, etc. in video games. All you do is push video to our API, which we then process, after which you can search + query using API calls. We're starting by supporting a few popular games like League of Legends, Dota 2, CSGO, and Overwatch. Here are the docs.
-
Gauging sentiment in sales calls?
For context I'm the founder of a company called Sieve which is starting to work with some of these tools to automatically gauge things like attentiveness and facial expressions by automatically analyzing the video. Would be interesting to hear what you as users actually want.
-
[D] How computer vision will take over the world
P.S. I am potentially very biased because I'm working on Sieve, which is trying to work with these applications.
-
Smart features that are actually helpful?
Hey everyone! I recently started building Sieve, a really easy way for devs to understand video content. We've just started to work with quite a few video editing tools / companies (both online and offline ones) after having primarily focused on real-world applications like security, supply chain, and general media.
-
[P] Sieve: Process 24 hours of video in 10 mins (UPDATE - try it yourself!)
Hey everyone! I’m one of the creators of Sieve. I posted about it here a while back and thought I'd share that r/MachineLearning can now try it for free :)
-
Launch HN: Sieve (YC W22) – Pluggable APIs for Video Search
Hi HN, we’re Mokshith and Abhi from Sieve (https://sievedata.com). We’re building an API that lets you add video search to internal tools or customer applications, instantly. Sieve can process 24 hours of video in less than 10 minutes, and makes it easy to search video by detected objects / characteristics, motion data, and visual similarity. You can use our models out of the box, or plug in your own model endpoints into our infrastructure. Models can mean any software that produces output given an image.
Every industry from security, to media, supply chain, construction, retail, sports, and agriculture is being transformed by video analytics—but setting up the infrastructure to process video data quickly is difficult. Having to deal with video ingestion pipelines, computer-vision model training, and search functionality is not pretty. We’re building a platform that takes care of all of this so teams can focus on their domain expertise: building industry-specific software.
We met in high school, and were on the robotics team together. It was our first exposure to computer vision, and something we both deeply enjoyed. We ended up going to UC Berkeley together and worked on computer vision at places like Scale AI, Niantic, Ford, NVIDIA, Microsoft, and Second Spectrum. We were initially trying to solve problems for ourselves as computer vision developers but quickly realized the unique problems in video having to do with cost, efficiency, and scale. We also realized how important video would be in lots of verticals, and saw an opportunity to build infrastructure which wouldn’t have to be rebuilt by a fullstack dev at any company again.
Let’s take the example of cloud software for construction which might include tons of features from asset trackers to rental management and compliance checks. It doesn’t make sense for them to build their own video processing for telematics—the density and scale of video make this a difficult task. A single 30 FPS camera generates over 2.5M frames within a day of recording. Imagine this across thousands of cameras and many weeks of footage—not to mention the actual vertical-specific software they’re building for end users.
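As a sanity check on the figure above, the frame count for one continuous 30 FPS camera works out like this:

```python
# Frames produced by a single continuous 30 FPS camera in one day
fps = 30
seconds_per_day = 60 * 60 * 24  # 86,400 seconds
frames_per_day = fps * seconds_per_day
print(frames_per_day)  # 2592000 -- the "over 2.5M frames" mentioned above
```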
Sieve takes care of everything hard about processing and searching video. Our API allows you to process and search video with just two API calls. We use filtering, parallelization, and interpolation techniques to keep costs low, while being able to process 24 hours of video in under 10 minutes. Users can choose from our pre-existing set of models, or use their own models with our video processing engine. Our pricing can range anywhere from $0.08-$0.45 per minute of video processed based on the models clients are interested in and usage volume. Our FAQ page (https://sievedata.com/faq) explains these factors in more detail.
Our backend is built on serverless functions. We split each video into individual chunks which are processed in parallel and passed through multiple layers of filters to determine which chunks are “important”. We’re able to algorithmically ignore parts of video which are static, or change minimally, and focus on the parts that contain real action. We then run more expensive models on the most “important” parts of video, and interpolate results across frames to return information to customers at 30 FPS granularity. Our customers simply push signed video URLs to our platform, and this happens automatically. You can then use our API to query for intervals of interest.
We haven’t built an automated sign up flow yet because of our focus on the core product, but we still wanted to give all of you the chance to try Sieve on your own videos for free. You’ll be emailed a personal, limited-access API key.
Try it out: https://sieve-data.notion.site/Trying-Sieve-s-Video-Search-4...
Visual dashboard demo: https://www.youtube.com/watch?v=_uyjp_HGZl4
We’d love to hear what you think about the product and vision, and ideas on how we can improve it. Thanks for taking the time to read this, we’re grateful to be posting here :)
yolov5
-
Classify dog and cat breeds easily with YOLOv5
Ref:
- https://www.youtube.com/watch?v=0GwnxFNfZhM
- https://github.com/ultralytics/yolov5
- https://dev.to/gfstealer666/kaaraich-yolo-alkrithuemainkaartrwcchcchabwatthu-object-detection-3lef
- https://www.kaggle.com/datasets/devdgohil/the-oxfordiiit-pet-dataset/data
- How would i go about having YOLO v5 return me a list from left to right of all detected objects in an image?
-
Building a Drowsiness Detection Web App from scratch - pt2
```shell
## Clone the YOLOv5 repository
!git clone https://github.com/ultralytics/yolov5.git
## Navigate to the model
%cd yolov5/
## Install requirements
!pip install -r requirements.txt
## Download the YOLOv5 model
!wget https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5s.pt
```
-
[Help: Project] Transfer Learning on YOLOv8
Specifically, what I did was take coco128.yaml, add 6 new classes from Dataset A (which had already been converted to YOLO Darknet TXT format) at indices 0-5, and shift the indices of the other COCO classes accordingly. Then I proceeded to train and validate on Dataset A for 20 epochs.
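For reference, a custom dataset YAML in this format generally looks like the sketch below. The paths and class names here are placeholders, not the poster's actual dataset:

```yaml
# Hypothetical dataset config: 6 custom classes occupy indices 0-5,
# so the original COCO indices shift by 6 (the re-indexing described above).
path: ../datasets/dataset-a   # dataset root (placeholder)
train: images/train
val: images/val

names:
  0: custom_class_0   # new classes from Dataset A at indices 0-5
  1: custom_class_1
  2: custom_class_2
  3: custom_class_3
  4: custom_class_4
  5: custom_class_5
  6: person            # original COCO class 0, shifted to 6
  7: bicycle           # original COCO class 1, shifted to 7
  # remaining COCO classes continue with shifted indices
```

Note that every label file in the dataset must use the same shifted indices as this config, or training targets will be silently wrong.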
-
Changing labels of default YOLOv5 model
I am using the default YOLOv5m6 model here with the sahi/yolov5 library for my object detection project. I want to change just some of the labels - for example, when YOLO detects a human, I want it to label the human as "threat", not "person". Is there any way I can do this by just changing some code, or do I have to train the model from scratch with changed labels?
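One lightweight option is to remap labels after inference instead of retraining. A minimal sketch, assuming detections come back as (label, confidence, box) tuples (the detections and the mapping dict here are made up for illustration):

```python
# Post-processing relabel: rename selected class labels without retraining.
RELABEL = {"person": "threat"}  # your custom label mapping

def relabel(detections, mapping=RELABEL):
    """Rename labels in-place of the model's defaults; unmapped labels pass through."""
    return [(mapping.get(label, label), conf, box)
            for label, conf, box in detections]

detections = [("person", 0.91, (10, 20, 50, 80)),
              ("dog", 0.77, (5, 5, 30, 40))]
print(relabel(detections))
# [('threat', 0.91, (10, 20, 50, 80)), ('dog', 0.77, (5, 5, 30, 40))]
```

If you load YOLOv5 via torch.hub, editing the model's class-name table (e.g. its `names` attribute) before running inference is another common approach, though how that interacts with the sahi wrapper would need checking.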
-
First time working with computer vision, need help figuring out a problem in my model
You should add them without annotations. Go through this.
-
AI Camera?
You are correct and if you check the firmware, it's yet another famous 3rd party project without attribution, namely https://github.com/ultralytics/yolov5
-
First non-default print on K1 - success
On one side, being a Linux user for 24 years now, it annoys me that they rip off code and claim it as theirs, thus violating licenses; but on the other, thanks to k3d's exploit I'm able to tinker more with the machine and, if needed, do (selective) updates by hand rather than with a closed-source system. It's not just "klipper": alongside klipper, fluidd and moonraker, it's also ffmpeg and mjpegstreamer. It's gonna be interesting, since they also use a project that isn't just GPL but AGPL (in short: "If your software provides a service online, you have to publish its source code and that of any library it borrows functions from.") - they use yolov5 (for AI).
- How does the background class work in object detection?
What are some alternatives?
nodejs-vision - Node.js client for Google Cloud Vision: Derive insight from images.
mmdetection - OpenMMLab Detection Toolbox and Benchmark
rpi-object-detection - Real-time object detection and tracking using Raspberry Pi and OpenCV!
detectron2 - Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
darknet - YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
Deep-SORT-YOLOv4 - People detection and optional tracking with Tensorflow backend.
yolor - implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)
OpenCV - Open Source Computer Vision Library
yolov5-crowdhuman - Head and Person detection using yolov5. Detection from crowd.
CenterNet - Object detection, 3D detection, and pose estimation using center point detection:
yolov3 - YOLOv3 in PyTorch > ONNX > CoreML > TFLite
edge-tpu-tiny-yolo - Run Tiny YOLO-v3 on Google's Edge TPU USB Accelerator.