nodejs-vision
automatic-video-processing
Metric | nodejs-vision | automatic-video-processing
---|---|---
Mentions | 48 | 23
Stars | 494 | 72
Growth | - | -
Activity | 7.3 | 5.0
Last commit | over 1 year ago | about 2 years ago
Language | TypeScript | Python
License | Apache License 2.0 | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
nodejs-vision
- GitHub Is Sued, and We May Learn Something About Creative Commons Licensing
Google also used all of this to improve their OCR algorithms, almost certainly used in Google Cloud Vision[0], but I doubt this was a consideration when deciding if it was transformative/fair use.
0: https://cloud.google.com/vision
- API that describes/labels images
You are looking for a computer vision object-recognition API. There are several, all of which cost money. Here are a couple: Google’s and Microsoft’s.
- Unique images help - right?
For example, the Google Cloud API can read images, including facial expressions and much more, and can also detect which images violate Google's terms and conditions, such as pornographic images.
- Just realized my algo is useless in live markets using heikin ashi
tool/software = https://cloud.google.com/vision
- MIERUKO CHAN CHAPTER 46
- [DISC] Mieruko-chan - Ch 46
- [DISC] Rosen Garten・Saga - Episode29「出番 〜Persona〜」 RAW
- I didn’t lose a grey mid.. so is this fake or am I crazy and just can’t remember?
I think you’re underestimating how good OCR tech is. Yeah, shitty penmanship will be harder for software to interpret just as it is for the human eye, but legible ink isn’t much of a problem. I just tried scanning my number off my discs and my iPhone digitized it without issue. This technology is essentially made for digitizing handwritten text. Try it yourself if you’d like: https://cloud.google.com/vision
- GMB is Rejecting Everything for Us - Is This the New Normal or a Glitch?
- Discussion - Raw japanese scans - Chapter 66 - Kemono Jihen / 怪物事変
You guys can use this website to extract the Japanese text from the manga pages: https://cloud.google.com/vision/. Then you can Google-translate the text.
automatic-video-processing
- Building an API + query language for rich data like images and video
I'm one of the creators of Sieve, and I'm looking for general thoughts on this problem.
- I built the easiest way to process and tag videos with AI
- The YC Winter 2022 Batch
https://sievedata.com seems very promising. A search engine for videos with specific tags sounds like a very good idea.
I'd like the same for all my photos and videos: it would make it so much easier to find specific pictures by keyword.
- Locally vs cloud stored management systems
The reason I ask is because I'm working on something called Sieve. We're basically making it really easy for any software developer to process and understand video content. This includes applications from home security, to pet monitoring, baby monitoring, sports analytics, and media understanding.
- AI video understanding in games
Hey everyone! I'm the creator of Sieve, an API for AI-based video understanding. One application we're starting to support in beta is tracking player / object movements, speed, etc in video games. All you do is push video to our API, which we then process, after which you can search + query using API calls. We're starting by supporting a few popular games like League of Legends, Dota 2, CSGO, and Overwatch. Here are the docs.
- Gauging sentiment in sales calls?
For context I'm the founder of a company called Sieve which is starting to work with some of these tools to automatically gauge things like attentiveness and facial expressions by automatically analyzing the video. Would be interesting to hear what you as users actually want.
- [D] How computer vision will take over the world
P.S. I am potentially very biased because I'm working on Sieve, which is trying to work with these applications.
- Smart features that are actually helpful?
Hey everyone! I recently started building Sieve, a really easy way for devs to understand video content. We've just started to work with quite a few video editing tools / companies (both online and offline ones) after having primarily focused on real-world applications like security, supply chain, and general media.
- [P] Sieve: Process 24 hours of video in 10 mins (UPDATE - try it yourself!)
Hey everyone! I’m one of the creators of Sieve. I posted about it here a while back and thought I'd share that r/MachineLearning can now try it for free :)
- Launch HN: Sieve (YC W22) – Pluggable APIs for Video Search
Hi HN, we’re Mokshith and Abhi from Sieve (https://sievedata.com). We’re building an API that lets you add video search to internal tools or customer applications, instantly. Sieve can process 24 hours of video in less than 10 minutes, and makes it easy to search video by detected objects / characteristics, motion data, and visual similarity. You can use our models out of the box, or plug your own model endpoints into our infrastructure. A model here can be any software that produces output given an image.
Every industry from security, to media, supply chain, construction, retail, sports, and agriculture is being transformed by video analytics—but setting up the infrastructure to process video data quickly is difficult. Having to deal with video ingestion pipelines, computer-vision model training, and search functionality is not pretty. We’re building a platform that takes care of all of this so teams can focus on their domain-expertise, building industry-specific software.
We met in high school, and were on the robotics team together. It was our first exposure to computer vision, and something we both deeply enjoyed. We ended up going to UC Berkeley together and worked on computer vision at places like Scale AI, Niantic, Ford, NVIDIA, Microsoft, and Second Spectrum. We were initially trying to solve problems for ourselves as computer vision developers but quickly realized the unique problems in video having to do with cost, efficiency, and scale. We also realized how important video would be in lots of verticals, and saw an opportunity to build infrastructure which wouldn’t have to be rebuilt by a fullstack dev at any company again.
Let’s take the example of cloud software for construction which might include tons of features from asset trackers to rental management and compliance checks. It doesn’t make sense for them to build their own video processing for telematics—the density and scale of video make this a difficult task. A single 30 FPS camera generates over 2.5M frames within a day of recording. Imagine this across thousands of cameras and many weeks of footage—not to mention the actual vertical-specific software they’re building for end users.
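The frame count quoted above follows from simple arithmetic. A quick check, assuming continuous 24-hour recording with no dropped frames (the function name and the 1,000-camera figure are illustrative, not from the post):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def frames_per_day(fps: int = 30, cameras: int = 1) -> int:
    """Frames produced in one day of continuous recording."""
    return fps * SECONDS_PER_DAY * cameras


print(frames_per_day())              # 2592000, i.e. "over 2.5M"
print(frames_per_day(cameras=1000))  # 2592000000 across a hypothetical 1,000 cameras
```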
Sieve takes care of everything hard about processing and searching video. Our API allows you to process and search video with just two API calls. We use filtering, parallelization, and interpolation techniques to keep costs low, while being able to process 24 hours of video in under 10 minutes. Users can choose from our pre-existing set of models, or use their own models with our video processing engine. Our pricing can range anywhere from $0.08-$0.45 per minute of video processed based on the models clients are interested in and usage volume. Our FAQ page (https://sievedata.com/faq) explains these factors in more detail.
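To put the quoted numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes the per-minute price applies uniformly to every minute of footage, which the post does not state; the function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes of footage in a day


def cost_range(minutes: float, low: float = 0.08, high: float = 0.45) -> tuple:
    """Lower and upper cost in dollars at the quoted per-minute price range."""
    return (round(minutes * low, 2), round(minutes * high, 2))


def speedup(video_minutes: float, processing_minutes: float) -> float:
    """How many times faster than real time the processing runs."""
    return video_minutes / processing_minutes


print(cost_range(MINUTES_PER_DAY))   # (115.2, 648.0)
print(speedup(MINUTES_PER_DAY, 10))  # 144.0
```

Under those assumptions, a full day of footage from one camera would cost somewhere between about $115 and $648, and the "24 hours in under 10 minutes" claim corresponds to at least a 144x speedup over real time.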
Our backend is built on serverless functions. We split each video into individual chunks which are processed in parallel and passed through multiple layers of filters to determine which chunks are “important”. We’re able to algorithmically ignore parts of video which are static, or change minimally, and focus on the parts that contain real action. We then run more expensive models on the most “important” parts of video, and interpolate results across frames to return information to customers at 30 FPS granularity. Our customers simply push signed video URLs to our platform, and this happens automatically. You can then use our API to query for intervals of interest.
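The chunk, filter, and interpolate flow described above can be sketched roughly as follows. This is not Sieve's implementation: a thread pool stands in for serverless functions, a frame-difference score stands in for the unspecified "importance" filters, linear interpolation is an assumed choice, and every name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Sequence

Frame = Sequence[float]  # flattened grayscale frame; a stand-in for real image data


def split_into_chunks(frames: List[Frame], chunk_size: int) -> List[List[Frame]]:
    """Cut the video into fixed-size runs of consecutive frames."""
    return [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]


def motion_score(chunk: List[Frame]) -> float:
    """Mean absolute pixel difference between consecutive frames in a chunk."""
    if len(chunk) < 2:
        return 0.0
    diffs = [
        sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        for prev, cur in zip(chunk, chunk[1:])
    ]
    return sum(diffs) / len(diffs)


def interpolate(keyed: Dict[int, float], length: int) -> List[float]:
    """Linearly fill in model outputs between the sampled frame indices."""
    idxs = sorted(keyed)
    out = []
    for i in range(length):
        lo = max((k for k in idxs if k <= i), default=idxs[0])
        hi = min((k for k in idxs if k >= i), default=idxs[-1])
        if lo == hi:
            out.append(keyed[lo])
        else:
            t = (i - lo) / (hi - lo)
            out.append(keyed[lo] + t * (keyed[hi] - keyed[lo]))
    return out


def process_chunk(chunk: List[Frame], model: Callable[[Frame], float],
                  threshold: float, stride: int) -> Optional[List[float]]:
    """Skip static chunks; otherwise run the model on sampled frames only."""
    if motion_score(chunk) < threshold:
        return None  # static chunk: the expensive model is never called
    sampled = sorted(set(range(0, len(chunk), stride)) | {len(chunk) - 1})
    keyed = {i: model(chunk[i]) for i in sampled}
    return interpolate(keyed, len(chunk))  # one value per frame


def process_video(frames: List[Frame], model: Callable[[Frame], float],
                  chunk_size: int = 4, threshold: float = 1.0,
                  stride: int = 2) -> List[Optional[List[float]]]:
    """Process all chunks in parallel and return per-chunk, per-frame results."""
    chunks = split_into_chunks(frames, chunk_size)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(
            lambda c: process_chunk(c, model, threshold, stride), chunks))


# Toy input: four identical frames, then four frames that get brighter.
static = [[0, 0]] * 4
moving = [[0, 0], [10, 10], [20, 20], [30, 30]]
mean_brightness = lambda frame: sum(frame) / len(frame)  # stand-in "model"

print(process_video(static + moving, mean_brightness))
# [None, [0.0, 10.0, 20.0, 30.0]]
```

In the toy run the static chunk is skipped entirely, and in the moving chunk the model is only called on frames 0, 2, and 3; the value for frame 1 is interpolated. One limitation of this sketch is that motion spanning a chunk boundary is not detected.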
We haven’t built an automated sign up flow yet because of our focus on the core product, but we still wanted to give all of you the chance to try Sieve on your own videos for free. You’ll be emailed a personal, limited-access API key.
Try it out: https://sieve-data.notion.site/Trying-Sieve-s-Video-Search-4...
Visual dashboard demo: https://www.youtube.com/watch?v=_uyjp_HGZl4
We’d love to hear what you think about the product and vision, and ideas on how we can improve it. Thanks for taking the time to read this, we’re grateful to be posting here :)
What are some alternatives?
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
rpi-object-detection - Real-time object detection and tracking using Raspberry Pi and OpenCV!
open_nsfw - Not Suitable for Work (NSFW) classification using deep neural network Caffe models.
CLIP - CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
streamlit - Streamlit — A faster way to build and share data apps.
forgefed - ForgeFed - Federation Protocol for Forge Services
google-cloud-ops-agents-ansible - Ansible Role for Google Cloud Ops
cloud-builders - Builder images and examples commonly used for Google Cloud Build
tfjs - A WebGL accelerated JavaScript library for training and deploying ML models.
gvisor - Application Kernel for Containers
examples - TensorFlow examples
esp-v2 - A service proxy that provides API management capabilities using Google Service Infrastructure.