AS-One is a Python wrapper for multiple detection and tracking algorithms, all in one place. Trackers such as ByteTrack, DeepSORT, or NorFair can be combined with different versions of YOLO in just a few lines of code, and the wrapper provides YOLO models in ONNX, PyTorch, and CoreML flavors. It is one library for most of your computer vision needs.
Installation
For Linux
python3 -m venv .env
source .env/bin/activate
pip install numpy Cython
pip install cython-bbox asone onnxruntime-gpu==1.12.1 typing_extensions==4.4.0
pip install super-gradients==3.4.1
# for CPU
pip install torch torchvision
# for GPU
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113
For Windows 10/11
python -m venv .env
.env\Scripts\activate
pip install numpy Cython
pip install lap
pip install -e git+https://github.com/samson-wang/cython_bbox.git#egg=cython-bbox
pip install asone onnxruntime-gpu==1.12.1
pip install typing_extensions==4.4.0
pip install super-gradients==3.4.1
# for CPU
pip install torch torchvision
# for GPU
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113
or
pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
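Either way, a quick sanity check confirms that the GPU build of PyTorch is working before moving on. This uses standard PyTorch calls only, not AS-One itself:

import torch

print(torch.__version__)          # should report a +cu113 build for GPU installs
print(torch.cuda.is_available())  # True if CUDA is set up correctly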
Object Detection
ASOne simplifies object detection by letting you use pre-trained weights out of the box. By default, it downloads these weights for you, so you can start detecting objects in images or videos with minimal setup.
import asone
from asone import utils
from asone import ASOne
import cv2

video_path = 'data/sample_videos/test.mp4'
detector = ASOne(detector=asone.YOLOV7_PYTORCH, use_cuda=True)  # Set use_cuda to False for CPU
filter_classes = None  # Set to None to detect all classes

cap = cv2.VideoCapture(video_path)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    dets, img_info = detector.detect(frame, filter_classes=filter_classes)
    # Each row of dets is [x1, y1, x2, y2, score, class_id]
    bbox_xyxy = dets[:, :4]
    scores = dets[:, 4]
    class_ids = dets[:, 5]
    frame = utils.draw_boxes(frame, bbox_xyxy, class_ids=class_ids)
    cv2.imshow('result', frame)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Furthermore, ASOne can filter for specific classes during detection, letting developers focus on only the classes they need.
filter_classes = ['person', 'car'] # Example: Detect only 'person' and 'car' classes
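With a filter set, detect() returns only those classes. Here is a minimal sketch that reuses detector and frame from the loop above and tallies detections per class, assuming the model uses standard COCO class ids (0 = person, 2 = car):

from collections import Counter

dets, img_info = detector.detect(frame, filter_classes=['person', 'car'])
# Tally detections per class id (COCO: 0 = person, 2 = car)
counts = Counter(int(cls) for cls in dets[:, 5])
print(counts)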
ASOne offers a wide range of models to choose from. Developers can switch between YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, YOLOR, YOLO-NAS, and their variants to match a project's accuracy and compute requirements.
# Change detector
detector = ASOne(detector=asone.YOLOX_S_PYTORCH, use_cuda=True)
detector = ASOne(detector=asone.YOLOV6N_PYTORCH, use_cuda=True)

# For macOS (CoreML)
# YOLOv5
detector = ASOne(detector=asone.YOLOV5X_MLMODEL)
You can see all the models below.
PyTorch
YOLOV5 | YOLOV6 | YOLOV7 | YOLOV8 | YOLOR | YOLOX | YOLO-NAS |
--- | --- | --- | --- | --- | --- | --- |
YOLOV5X6_PYTORCH | YOLOV6N_PYTORCH | YOLOV7_TINY_PYTORCH | YOLOV8N_PYTORCH | YOLOR_CSP_X_PYTORCH | YOLOX_L_PYTORCH | YOLONAS_S_PYTORCH |
YOLOV5S_PYTORCH | YOLOV6T_PYTORCH | YOLOV7_PYTORCH | YOLOV8S_PYTORCH | YOLOR_CSP_X_STAR_PYTORCH | YOLOX_NANO_PYTORCH | YOLONAS_M_PYTORCH |
YOLOV5N_PYTORCH | YOLOV6S_PYTORCH | YOLOV7_X_PYTORCH | YOLOV8M_PYTORCH | YOLOR_CSP_STAR_PYTORCH | YOLOX_TINY_PYTORCH | YOLONAS_L_PYTORCH |
YOLOV5M_PYTORCH | YOLOV6M_PYTORCH | YOLOV7_W6_PYTORCH | YOLOV8L_PYTORCH | YOLOR_CSP_PYTORCH | YOLOX_DARKNET_PYTORCH | --- |
YOLOV5L_PYTORCH | YOLOV6L_PYTORCH | YOLOV7_E6_PYTORCH | YOLOV8X_PYTORCH | YOLOR_P6_PYTORCH | YOLOX_S_PYTORCH | --- |
YOLOV5X_PYTORCH | YOLOV6L_RELU_PYTORCH | YOLOV7_D6_PYTORCH | --- | --- | YOLOX_M_PYTORCH | --- |
YOLOV5N6_PYTORCH | YOLOV6S_REPOPT_PYTORCH | YOLOV7_E6E_PYTORCH | --- | --- | YOLOX_X_PYTORCH | --- |
YOLOV5S6_PYTORCH | --- | --- | --- | --- | --- | --- |
YOLOV5M6_PYTORCH | --- | --- | --- | --- | --- | --- |
YOLOV5L6_PYTORCH | --- | --- | --- | --- | --- | --- |
ONNX
YOLOV5 | YOLOV6 | YOLOV7 | YOLOV8 | YOLOR | YOLOX |
--- | --- | --- | --- | --- | --- |
YOLOV5X6_ONNX | YOLOV6N_ONNX | YOLOV7_TINY_ONNX | YOLOV8N_ONNX | YOLOR_CSP_X_ONNX | YOLOX_L_ONNX |
YOLOV5S_ONNX | YOLOV6T_ONNX | YOLOV7_ONNX | YOLOV8S_ONNX | YOLOR_CSP_X_STAR_ONNX | YOLOX_NANO_ONNX |
YOLOV5N_ONNX | YOLOV6S_ONNX | YOLOV7_X_ONNX | YOLOV8M_ONNX | YOLOR_CSP_STAR_ONNX | YOLOX_TINY_ONNX |
YOLOV5M_ONNX | YOLOV6M_ONNX | YOLOV7_W6_ONNX | YOLOV8L_ONNX | YOLOR_CSP_ONNX | YOLOX_DARKNET_ONNX |
YOLOV5L_ONNX | YOLOV6L_ONNX | YOLOV7_E6_ONNX | YOLOV8X_ONNX | YOLOR_P6_ONNX | YOLOX_S_ONNX |
YOLOV5X_ONNX | YOLOV6L_RELU_ONNX | YOLOV7_D6_ONNX | --- | --- | YOLOX_M_ONNX |
YOLOV5N6_ONNX | YOLOV6S_REPOPT_ONNX | YOLOV7_E6E_ONNX | --- | --- | YOLOX_X_ONNX |
YOLOV5S6_ONNX | --- | --- | --- | --- | --- |
YOLOV5M6_ONNX | --- | --- | --- | --- | --- |
YOLOV5L6_ONNX | --- | --- | --- | --- | --- |
CoreML
YOLOV5 | YOLOV7 | YOLOV8 |
--- | --- | --- |
YOLOV5X6_MLMODEL | YOLOV7_TINY_MLMODEL | YOLOV8N_MLMODEL |
YOLOV5S_MLMODEL | YOLOV7_MLMODEL | YOLOV8S_MLMODEL |
YOLOV5N_MLMODEL | YOLOV7_X_MLMODEL | YOLOV8M_MLMODEL |
YOLOV5M_MLMODEL | YOLOV7_W6_MLMODEL | YOLOV8L_MLMODEL |
YOLOV5L_MLMODEL | YOLOV7_E6_MLMODEL | YOLOV8X_MLMODEL |
YOLOV5X_MLMODEL | YOLOV7_D6_MLMODEL | --- |
YOLOV5N6_MLMODEL | YOLOV7_E6E_MLMODEL | --- |
YOLOV5S6_MLMODEL | --- | --- |
YOLOV5M6_MLMODEL | --- | --- |
YOLOV5L6_MLMODEL | --- | --- |
Developers often encounter scenarios where pre-trained weights do not suffice. ASOne offers flexibility here by supporting custom weights: use the weights argument to specify a custom weights path.
import asone
from asone import utils
from asone import ASOne
import cv2

video_path = 'data/sample_videos/license_video.webm'
detector = ASOne(detector=asone.YOLOV7_PYTORCH, weights='data/custom_weights/yolov7_custom.pt', use_cuda=True)  # Set use_cuda to False for CPU
class_names = ['license_plate']  # Your custom classes list

cap = cv2.VideoCapture(video_path)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    dets, img_info = detector.detect(frame)
    bbox_xyxy = dets[:, :4]
    scores = dets[:, 4]
    class_ids = dets[:, 5]
    # Pass your custom classes list so your own labels appear on the result video
    frame = utils.draw_boxes(frame, bbox_xyxy, class_ids=class_ids, class_names=class_names)
    cv2.imshow('result', frame)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
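Since the returned class_ids index into your class list, mapping the numeric ids back to label strings is a one-liner (a small sketch assuming the ids index class_names in order):

# Map numeric class ids back to label strings
labels = [class_names[int(i)] for i in class_ids]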
Object Tracking
import asone
from asone import ASOne

# Instantiate ASOne object
detect = ASOne(tracker=asone.BYTETRACK, detector=asone.YOLOV7_PYTORCH, use_cuda=True)  # Set use_cuda=False to use CPU
filter_classes = ['person']  # Set to None to track all classes

# ##############################################
# To track using a video file
# ##############################################
# Get an iterable of per-frame tracking results
track = detect.track_video('data/sample_videos/test.mp4', output_dir='data/results', save_result=True, display=True, filter_classes=filter_classes)

# Loop over track to retrieve the outputs of each frame
for bbox_details, frame_details in track:
    bbox_xyxy, ids, scores, class_ids = bbox_details
    frame, frame_num, fps = frame_details
    # Do anything with bboxes here

# ##############################################
# To track using a webcam
# ##############################################
track = detect.track_webcam(cam_id=0, output_dir='data/results', save_result=True, display=True, filter_classes=filter_classes)

for bbox_details, frame_details in track:
    bbox_xyxy, ids, scores, class_ids = bbox_details
    frame, frame_num, fps = frame_details
    # Do anything with bboxes here

# ##############################################
# To track using a web stream
# ##############################################
stream_url = 'rtsp://wowzaec2demo.streamlock.net/vod/mp4:BigBuckBunny_115k.mp4'
track = detect.track_stream(stream_url, output_dir='data/results', save_result=True, display=True, filter_classes=filter_classes)

for bbox_details, frame_details in track:
    bbox_xyxy, ids, scores, class_ids = bbox_details
    frame, frame_num, fps = frame_details
    # Do anything with bboxes here
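The per-frame outputs make simple analytics straightforward. For example, here is a minimal sketch, built on the tracking loop above, that counts how many distinct objects have appeared so far:

seen_ids = set()
for bbox_details, frame_details in track:
    bbox_xyxy, ids, scores, class_ids = bbox_details
    frame, frame_num, fps = frame_details
    # Track ids persist across frames, so the set size is a running unique count
    seen_ids.update(int(i) for i in ids)
    print(f'frame {frame_num}: {len(seen_ids)} unique objects so far')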
You can change the tracker by simply changing the tracker flag:
detect = ASOne(tracker=asone.BYTETRACK, detector=asone.YOLOV7_PYTORCH, use_cuda=True)
# Change tracker
detect = ASOne(tracker=asone.DEEPSORT, detector=asone.YOLOV7_PYTORCH, use_cuda=True)
You can use any of the following trackers:
DEEPSORT
BYTETRACK
NORFAIR
MOTPY
STRONGSORT
OCSORT
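To compare trackers on the same clip, you can loop over these flags. A quick sketch, assuming the track_video API shown earlier:

for tracker_flag in [asone.BYTETRACK, asone.DEEPSORT, asone.NORFAIR,
                     asone.MOTPY, asone.STRONGSORT, asone.OCSORT]:
    detect = ASOne(tracker=tracker_flag, detector=asone.YOLOV7_PYTORCH, use_cuda=True)
    track = detect.track_video('data/sample_videos/test.mp4',
                               output_dir='data/results', save_result=True, display=False)
    for _ in track:
        pass  # consume the generator so each tracker's results are saved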
Pose Estimation
ASOne offers robust pose estimation, enabling developers to accurately detect and analyze keypoints in images or videos. This is useful for understanding human poses, tracking movement, and deriving insights from visual data.
Pose estimation on an image:
import asone
from asone import utils
from asone import PoseEstimator
import cv2
img_path = 'data/sample_imgs/test2.jpg'
pose_estimator = PoseEstimator(estimator_flag=asone.YOLOV8M_POSE, use_cuda=True) # Set use_cuda=False to use CPU
img = cv2.imread(img_path)
kpts = pose_estimator.estimate_image(img)
img = utils.draw_kpts(img, kpts)
cv2.imwrite("data/results/results.jpg", img)
Pose estimation on a video:
import asone
from asone import PoseEstimator

video_path = 'data/sample_videos/football1.mp4'
pose_estimator = PoseEstimator(estimator_flag=asone.YOLOV7_W6_POSE, use_cuda=True)  # Set use_cuda=False to use CPU
estimator = pose_estimator.estimate_video(video_path, save=True, display=True)
for kpts, frame_details in estimator:
    frame, frame_num, fps = frame_details
    # Perform operations with kpts here
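The same utils.draw_kpts helper used for images also works per frame. Here is a minimal sketch that annotates each frame and saves it yourself; the output path is illustrative:

import cv2
from asone import utils

for kpts, frame_details in estimator:
    frame, frame_num, fps = frame_details
    frame = utils.draw_kpts(frame, kpts)
    cv2.imwrite(f'data/results/frame_{frame_num}.jpg', frame)  # illustrative output path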
You can use any of the available pose estimation models by simply changing the flag:
YOLOV8 | YOLOV7 |
--- | --- |
YOLOV8N_POSE | YOLOV7_W6_POSE |
YOLOV8S_POSE | --- |
YOLOV8M_POSE | --- |
YOLOV8L_POSE | --- |
YOLOV8X_POSE | --- |
Text Detection
ASOne integrates text detection and recognition, streamlining the process of identifying and extracting text from images. Using the CRAFT detector together with the EasyOCR recognizer, developers can easily locate and read textual content.
import asone
from asone import utils
from asone import ASOne
import cv2
img_path = 'data/sample_imgs/sample_text.jpeg'
ocr = ASOne(detector=asone.CRAFT, recognizer=asone.EASYOCR, use_cuda=True) # Set use_cuda=False for CPU
img = cv2.imread(img_path)
results = ocr.detect_text(img)
img = utils.draw_text(img, results)
cv2.imwrite("data/results/results.jpg", img)
The code snippet above showcases ASOne's ability to process images, detect text regions, recognize the text content, and visualize the identified text on the image.
ASOne's text tracking capabilities extend to videos, allowing for the continuous monitoring and tracking of text across frames. This functionality aids in tracking text-related information in dynamic video sequences.
import asone
from asone import ASOne

# Instantiate ASOne object
detect = ASOne(tracker=asone.DEEPSORT, detector=asone.CRAFT, recognizer=asone.EASYOCR, use_cuda=True)  # Set use_cuda=False for CPU

# Track text in a video file
track = detect.track_video('data/sample_videos/GTA_5-Unique_License_Plate.mp4', output_dir='data/results', save_result=True, display=True)

# Process tracking results for each frame
for bbox_details, frame_details in track:
    bbox_xyxy, ids, scores, class_ids = bbox_details
    frame, frame_num, fps = frame_details
    # Perform operations with bounding boxes here
This code snippet demonstrates ASOne's capability to track text regions in videos using the DEEPSORT tracker, CRAFT detector, and EASYOCR recognizer, allowing developers to monitor and analyze text information across video frames.
Conclusion
In summary, ASOne offers an integrated toolkit for computer vision tasks spanning object detection, pose estimation, tracking, and text recognition. Its hallmark is adaptability: detectors and trackers can be swapped with simple flag changes. With support across macOS, Windows, and Linux, and an intuitive design that keeps prediction pipelines to a few lines of code, ASOne makes modern computer vision accessible and efficient.