Mike 2026-01-03 18:02:26 +02:00
parent 6f1ce9d6f3
commit 46de3e918f
106 changed files with 29931 additions and 1 deletion

CHANGELOG.md Normal file

@@ -0,0 +1,655 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [2.1.0] - 2025-11-27
### Added
- **Per-Track Quality-Based CDM Selection**: Dynamic CDM switching during runtime DRM operations
- Different CDMs can be used for different video quality levels within the same download session
- Example: use Widevine L3 for SD/HD and PlayReady SL3 for 4K content (see the config sketch after this list)
- **Enhanced Track Export**: Improved export functionality with additional metadata
- Added URL field to track export for easier identification
- Added descriptor information in export output
- Keys are now exported as hex-formatted strings
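A hedged sketch of what this could look like in `unshackle.yaml`, reusing the quality-threshold `cdm:` mapping documented under [1.4.6] below; the service tag and device names are placeholders, and mixing a Widevine and a PlayReady device in one mapping is an assumption based on the example above:
```yaml
cdm:
  EXAMPLE:
    "<=1080": chromecdm_903_l3     # e.g. Widevine L3 for SD/HD tracks
    ">1080": playready_sl3_device  # e.g. PlayReady SL3 for 4K tracks
    default: chromecdm_903_l3
```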
### Changed
- **Dependencies**: Upgraded to latest compatible versions
- Updated various dependencies to their latest versions
### Fixed
- **Attachment Preservation**: Fixed attachments being dropped during track filtering
- Attachments (screenshots, fonts) were being lost when track list was rebuilt
- Fixes image files remaining in temp directory after muxing
- **DASH BaseURL Resolution**: Added AdaptationSet-level BaseURL support per DASH spec
- URL resolution chain now properly follows: MPD → Period → AdaptationSet → Representation
- **WindscribeVPN Region Support**: Restricted to supported regions with proper error handling
- Added error handling for unsupported regions in get_proxy method
- Prevents cryptic errors when using unsupported region codes
- **Filename Sanitization**: Fixed space-hyphen-space handling in filenames
- Pre-process space-hyphen-space patterns (e.g., "Title - Episode") before other replacements
- Made space-hyphen-space handling conditional on scene_naming setting
- Addresses PR #44 by fixing the root cause
- **CICP Enum Values**: Corrected values to match ITU-T H.273 specification
- Added Primaries.Unspecified (value 2) per H.273 spec
- Renamed Primaries/Transfer value 0 from Unspecified to Reserved for spec accuracy
- Simplified Transfer value 2 from Unspecified_Image to Unspecified
- Verified against ITU-T H.273, ISO/IEC 23091-2, H.264/H.265 specs, and FFmpeg enums
- **HLS Byte Range Parsing**: Fixed TypeError in range_offset conversion
- Converted range_offset to int to prevent TypeError in calculate_byte_range
- **pyplayready Compatibility**: Pinned to <0.7 to avoid a KID extraction bug
## [2.0.0] - 2025-11-10
### Breaking Changes
- **REST API Integration**: Core architecture modified to support REST API functionality
- Changes to internal APIs for download management and tracking
- Title and track classes updated with API integration points
- Core component interfaces modified for queue management support
- **Configuration Changes**: New required configuration options for API and enhanced features
- `simkl_client_id` is now required for Simkl functionality (see the sketch after this list)
- Service-specific configuration override structure introduced
- Debug logging configuration options added
- **Forced Subtitles**: Behavior change for forced subtitle inclusion
- Forced subs are no longer auto-included; the explicit `--forced-subs` flag is required
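A minimal sketch of the new top-level options in `unshackle.yaml`; `simkl_client_id` and `debug` are named in this release, while the `debug_keys` value shown is an assumption:
```yaml
simkl_client_id: "your-simkl-client-id"  # from https://simkl.com/settings/developer/
debug: true                              # enable JSON Lines debug logging
debug_keys: false                        # assumed boolean; keys stay redacted unless enabled
```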
### Added
- **REST API Server**: Complete download management via REST API (early development)
- Implemented download queue management and worker system
- Added OpenAPI/Swagger documentation for easy API exploration
- Included download progress tracking and status endpoints
- API authentication and comprehensive error handling
- Updated core components to support API integration
- Early development work with more changes planned
- **CustomRemoteCDM**: Highly configurable custom CDM API support
- Support for third-party CDM API providers with maximum configurability
- Full configuration through YAML without code changes
- Addresses GitHub issue #26 for flexible CDM integration
- **WindscribeVPN Proxy Provider**: New VPN provider support
- Added WindscribeVPN following NordVPN and SurfsharkVPN patterns
- Fixes GitHub issue #29
- **Latest Episode Download**: New `--latest-episode` CLI option
- `-le, --latest-episode` flag to download only the most recent episode
- Automatically selects the single most recent episode regardless of season
- Fixes GitHub issue #28
- **Video Track Exclusion**: New `--no-video` CLI option
- `-nv, --no-video` flag to skip downloading video tracks
- Allows downloading only audio, subtitles, attachments, and chapters
- Useful for audio-only or subtitle extraction workflows
- Fixes GitHub issue #39
- **Service-Specific Configuration Overrides**: Per-service fine-tuned control
- Support for per-service configuration overrides in YAML
- Fine-tuned control of downloader and command options per service
- Fixes GitHub issue #13
- **Comprehensive JSON Debug Logging**: Structured logging for troubleshooting
- Binary toggle via `--debug` flag or `debug: true` in config
- JSON Lines (.jsonl) format for easy parsing and analysis
- Comprehensive logging of all operations (session info, CLI params, CDM details, auth status, title/track metadata, DRM operations, vault queries)
- Configurable key logging via `debug_keys` option with smart redaction
- Error logging for all critical operations
- Removed old text logging system
- **curl_cffi Retry Handler**: Enhanced session reliability
- Added automatic retry mechanism to curl_cffi Session
- Improved download reliability with configurable retries
- **Simkl API Configuration**: New API key support
- Added `simkl_client_id` configuration option
- Simkl now requires a client_id from https://simkl.com/settings/developer/
- **Custom Session Fingerprints**: Enhanced browser impersonation capabilities
- Added custom fingerprint and preset support for better service compatibility
- Configurable fingerprint presets for different device types
- Improved success rate with services using advanced bot detection
- **TMDB and Simkl Metadata Caching**: Enhanced title cache system
- Added metadata caching to title cache to reduce API calls
- Caches movie/show metadata alongside title information
- Improves performance for repeated title lookups and reduces API rate limiting
- **API Enhancements**: Improved REST API functionality
- Added default parameter handling for better request processing
- Added URL field to services endpoint response for easier service identification
- Complete API enhancements for production readiness
- Improved error responses with better detail and debugging information
### Changed
- **Binary Search Enhancement**: Improved binary discovery
- Refactored to search for binaries in the root of the binary folder or in a subfolder named after the binary
- Better organization of binary dependencies
- **Type Annotations**: Modernized to PEP 604 syntax
- Updated session.py type annotations to use modern Python syntax
- Improved code readability and type checking
### Fixed
- **Audio Description Track Support**: Added option to download audio description tracks
- Added `--audio-description/-ad` flag to optionally include descriptive audio tracks
- Previously, audio description tracks were always filtered out
- Users can now choose to download AD tracks when needed
- Fixes GitHub issue #33
- **Config Directory Support**: Cross-platform user config directory support
- Fixed config loading to properly support user config directories across all platforms
- Fixes GitHub issue #23
- **HYBRID Mode Validation**: Pre-download validation for hybrid processing
- Added validation to check both HDR10 and DV tracks are available before download
- Prevents wasted downloads when hybrid processing would fail
- **TMDB/Simkl API Keys**: Graceful handling of missing API keys
- Improved error handling when TMDB or Simkl API keys are not configured
- Better user messaging for API configuration requirements
- **Forced Subtitles Behavior**: Correct forced subtitle filtering
- Fixed forced subtitles being incorrectly included without `--forced-subs` flag
- Forced subs now only included when explicitly requested
- **Font Attachment Constructor**: Fixed ASS/SSA font attachment
- Use keyword arguments for Attachment constructor in font attachment
- Fixes "Invalid URL: No scheme supplied" error
- Fixes GitHub issue #24
- **Binary Subdirectory Checking**: Enhanced binary location discovery (by @TPD94, PR #19)
- Updated binaries.py to check subdirectories in binaries folders named after the binary
- Improved binary detection and loading
- **HLS Manifest Processing**: Minor HLS parser fix (by @TPD94, PR #19)
- **lxml and pyplayready**: Updated dependencies (by @Sp5rky)
- Updated lxml constraint and pyplayready import path for compatibility
- **DASH Segment Calculation**: Corrected segment handling
- Fixed segment count calculation for DASH manifests with startNumber=0
- Ensures accurate segment processing for all DASH manifest configurations
- Prevents off-by-one errors in segment downloads
- **HDR Detection and Naming**: Comprehensive HDR format support
- Improved HDR detection with comprehensive transfer characteristics checks
- Added hybrid DV+HDR10 support for accurate file naming
- Better identification of HDR formats across different streaming services
- More accurate HDR/DV detection in filename generation
- **Subtitle Processing**: VTT subtitle handling improvements
- Resolved SDH (Subtitles for Deaf and Hard of hearing) stripping crash when processing VTT files
- More robust subtitle processing pipeline with better error handling
- Fixes crashes when filtering specific VTT subtitle formats
- **DRM Processing**: Enhanced encoding handling
- Added explicit UTF-8 encoding to mp4decrypt subprocess calls
- Prevents encoding issues on systems with non-UTF-8 default encodings
- Improves cross-platform compatibility for Windows and some Linux configurations
- **Session Fingerprints**: Updated OkHttp presets
- Updated OkHttp fingerprint presets for better Android TV compatibility
- Improved success rate with services using fingerprint-based detection
### Documentation
- **GitHub Issue Templates**: Enhanced issue reporting
- Improved bug report template with better structure and required fields
- Enhanced feature request template for clearer specifications
- Added helpful guidance for contributors to provide complete information
### Refactored
- **Import Cleanup**: Removed unused imports
- Removed unused mypy import from binaries.py
- Fixed import ordering in API download_manager and handlers
### Contributors
This release includes contributions from:
- @Sp5rky - REST API server implementation, dependency updates
- @stabbedbybrick - curl_cffi retry handler (PR #31)
- @stabbedbybrick - n_m3u8dl-re refactor (PR #38)
- @TPD94 - Binary search enhancements, manifest parser fixes (PR #19)
- @scene (Andy) - Core features, configuration system, bug fixes
## [1.4.8] - 2025-10-08
### Added
- **Exact Language Matching**: New `--exact-lang` flag for precise language matching
- Enables strict language code matching without fallbacks
- **No-Mux Flag**: New `--no-mux` flag to skip muxing tracks into container files
- Useful for keeping individual track files separate
- **DecryptLabs API Integration for HTTP Vault**: Enhanced vault support
- Added DecryptLabs API support to HTTP vault for improved key retrieval
- **AC4 Audio Codec Support**: Enhanced audio format handling
- Added AC4 codec support in Audio class with updated mime/profile handling
- **pysubs2 Subtitle Conversion**: Extended subtitle format support
- Added pysubs2 subtitle conversion with extended format support
- Configurable conversion method in configuration
### Changed
- **Audio Track Sorting**: Optimized audio track selection logic
- Improved audio track sorting by grouping descriptive tracks and sorting by bitrate
- Better identification of ATMOS and DD+ as highest quality for filenaming
- **pyplayready Update**: Upgraded to version 0.6.3
- Updated import paths to resolve compatibility issues
- Fixed lxml constraints for better dependency management
- **pysubs2 Conversion Method**: Moved from auto to manual configuration
- pysubs2 no longer auto-selected during testing phase
### Fixed
- **Remote CDM**: Fixed curl_cffi compatibility
- Added curl_cffi to instance checks in RemoteCDM
- **Temporary File Handling**: Improved encoding handling
- Specified UTF-8 encoding when opening temporary files
### Reverted
- **tinycss SyntaxWarning Suppression**: Removed ineffective warning filter
- Reverted warnings filter that didn't work as expected for suppressing tinycss warnings
## [1.4.7] - 2025-09-25
### Added
- **curl_cffi Session Support**: Enhanced anti-bot protection with browser impersonation
- Added new session utility with curl_cffi support for bypassing anti-bot measures
- Browser impersonation support for Chrome, Firefox, and Safari user agents
- Full backward compatibility with requests.Session maintained
- Suppressed HTTPS proxy warnings for improved user experience
- **Download Retry Functionality**: Configurable retry mechanism for failed downloads
- Added retry count option to download function for improved reliability
- **Subtitle Requirements Options**: Enhanced subtitle download control
- Added options for required subtitles in download command
- Better control over subtitle track selection and requirements
- **Quality Selection Enhancement**: Improved quality selection options
- Added best available quality option in download command for optimal track selection
- **DecryptLabs API Integration**: Enhanced remote CDM configuration
- Added decrypt_labs_api_key to Config initialization for better API integration
### Changed
- **Manifest Parser Updates**: Enhanced compatibility across all parsers
- Updated DASH, HLS, ISM, and M3U8 parsers to accept curl_cffi sessions
- Improved cookie handling compatibility between requests and curl_cffi
- **Logging Improvements**: Reduced log verbosity for better user experience
- Changed duplicate track log level to debug to reduce console noise
- Dynamic CDM selection messages moved to debug-only output
### Fixed
- **Remote CDM Reuse**: Fixed KeyError in dynamic CDM selection
- Prevents KeyError when reusing remote CDMs in dynamic selection process
- Creates copy of CDM dictionary before modification to prevent configuration mutation
- Allows same CDM to be selected multiple times within session without errors
## [1.4.6] - 2025-09-13
### Added
- **Quality-Based CDM Selection**: Dynamic CDM selection based on video resolution
- Automatically selects appropriate CDM (L3/L1) based on video track quality
- Supports quality thresholds in configuration (>=, >, <=, <, exact match)
- Pre-selects optimal CDM based on highest quality across all video tracks
- Maintains backward compatibility with existing CDM configurations
- **Automatic Audio Language Metadata**: Intelligent embedded audio language detection
- Automatically sets audio language metadata when no separate audio tracks exist
- Smart video track selection based on title language with fallbacks
- Enhanced FFmpeg repackaging with audio stream metadata injection
- **Lazy DRM Loading**: Deferred DRM loading for multi-track key retrieval optimization
- Added deferred DRM loading to the M3U8 parser to mark tracks for later processing
- Just-in-time DRM loading during download process for better performance
### Changed
- **Enhanced CDM Management**: Improved CDM switching logic for multi-quality downloads
- CDM selection now based on highest quality track to avoid inefficient switching
- Quality-based selection only within same DRM type (Widevine-to-Widevine, PlayReady-to-PlayReady)
- Single CDM used per session for better performance and reliability
### Fixed
- **Vault Caching Issues**: Fixed vault count display and NoneType iteration errors
- Fixed 'NoneType' object is not iterable error in DecryptLabsRemoteCDM
- Fixed vault count display showing 0/3 instead of actual successful vault count
- **Service Name Transmission**: Resolved DecryptLabsRemoteCDM service name issues
- Fixed DecryptLabsRemoteCDM sending 'generic' instead of proper service names
- Added case-insensitive vault lookups for SQLite/MySQL vaults
- Added local vault integration to DecryptLabsRemoteCDM
- **Import Organization**: Improved import ordering and code formatting
- Reordered imports in decrypt_labs_remote_cdm.py for better organization
- Cleaned up trailing whitespace in vault files
### Configuration
- **New CDM Configuration Format**: Extended `cdm:` section supports quality-based selection
```yaml
cdm:
  SERVICE_NAME:
    "<=1080": l3_cdm_name
    ">1080": l1_cdm_name
    default: l3_cdm_name
```
## [1.4.5] - 2025-09-09
### Added
- **Enhanced CDM Key Caching**: Improved key caching and session management for L1/L2 devices
- Optimized `get_cached_keys_if_exists` functionality for better performance with L1/L2 devices
- Enhanced cached key retrieval logic with improved session handling
- **Widevine Common Certificate Fallback**: Added fallback to Widevine common certificate for L1 devices
- Improved compatibility for L1 devices when service certificates are unavailable
- **Enhanced Vault Loading**: Improved vault loading and key copying logic
- Better error handling and key management in vault operations
- **PSSH Display Optimization**: Truncated PSSH string display in non-debug mode for cleaner output
- **CDM Error Messaging**: Added error messages for missing service certificates in CDM sessions
### Changed
- **Dynamic Version Headers**: Updated User-Agent headers to use dynamic version strings
- DecryptLabsRemoteCDM now uses dynamic version import instead of hardcoded version
- **Intelligent CDM Caching**: Implemented intelligent caching system for CDM license requests
- Enhanced caching logic reduces redundant license requests and improves performance
- **Enhanced Tag Handling**: Improved tag handling for TV shows and movies from Simkl data
- Better metadata processing and formatting for improved media tagging
### Fixed
- **CDM Session Management**: Clean up session data when retrieving cached keys
- Remove decrypt_labs_session_id and challenge from session when cached keys exist but there are missing KIDs
- Ensures clean state for subsequent requests and prevents session conflicts
- **Tag Formatting**: Fixed formatting issues in tag processing
- **Import Order**: Fixed import order issues in tags module
## [1.4.4] - 2025-09-02
### Added
- **Enhanced DecryptLabs CDM Support**: Comprehensive remote CDM functionality
- Full support for Widevine, PlayReady, and ChromeCDM through DecryptLabsRemoteCDM
- Enhanced session management and caching support for remote WV/PR operations
- Support for cached keys and improved license handling
- New CDM configurations for Chrome and PlayReady devices with updated User-Agent and service certificate
- **Advanced Configuration Options**: New device and language preferences
- Added configuration options for device certificate status list
- Enhanced language preference settings
### Changed
- **DRM Decryption Enhancements**: Streamlined decryption process
- Simplified decrypt method by removing unused parameter and streamlined logic
- Improved DecryptLabs CDM configurations with better device support
### Fixed
- **Matroska Tag Compliance**: Enhanced media container compatibility
- Fixed Matroska tag compliance with official specification
- **Application Branding**: Cleaned up version display
- Removed old devine version reference from banner to avoid developer confusion
- Updated branding while maintaining original GNU license compliance
- **IP Information Handling**: Improved geolocation services
- Enhanced get_ip_info functionality with better failover handling
- Added support for 429 error handling and multiple API provider fallback
- Implemented cached IP info retrieval with fallback tester to avoid rate limiting
- **Dependencies**: Streamlined package requirements
- Removed unnecessary data extra requirement from langcodes
### Removed
- Deprecated version references in application banner for clarity
## [1.4.3] - 2025-08-20
### Added
- Cached IP info helper for region detection
- New `get_cached_ip_info()` with a 24-hour cache, provider rotation (ipinfo/ipapi), and 429 handling.
- Reduces external calls and stabilizes non-proxy region lookups for caching/logging.
### Changed
- DRM decryption selection is fully configuration-driven
- Widevine and PlayReady now select the decrypter based solely on `decryption` in YAML (including per-service mapping).
- Shaka Packager remains the default decrypter when not specified.
- `dl.py` logs the chosen tool based on the resolved configuration.
- Geofencing and proxy verification improvements
- Safer geofence checks with error handling and clearer logs.
- Always verify proxy exit region via live IP lookup; fallback to proxy parsing on failure.
- Example config updated to default to Shaka
- `unshackle.yaml`/example now sets `decryption.default: shaka` (service overrides still supported).
### Removed
- Deprecated parameter `use_mp4decrypt`
- Removed from `Widevine.decrypt()` and `PlayReady.decrypt()` and all callsites.
- Internal naming switched from mp4decrypt-specific flags to generic `decrypter` selection.
## [1.4.2] - 2025-08-14
### Added
- **Session Management for API Requests**: Enhanced API reliability with retry logic
- Implemented session management for tags functionality with automatic retry mechanisms
- Improved API request stability and error handling
- **Series Year Configuration**: New `series_year` option for title naming control
- Added configurable `series_year` option to control year inclusion in series titles
- Enhanced YAML configuration with series year handling options
- **Audio Language Override**: New audio language selection option
- Added `audio_language` option to override default language selection for audio tracks
- Provides more granular control over audio track selection
- **Vault Key Reception Control**: Enhanced vault security options
- Added `no_push` option to Vault and its subclasses to control key reception
- Improved key management security and flexibility
### Changed
- **HLS Segment Processing**: Enhanced segment retrieval and merging capabilities
- Enhanced segment retrieval to allow all file types for better compatibility
- Improved segment merging with recursive file search and fallback to binary concatenation
- Fixed issues with VTT files from HLS not being found correctly due to format changes
- Added cleanup of empty segment directories after processing
- **Documentation**: Updated README.md with latest information
### Fixed
- **Audio Track Selection**: Improved per-language logic for audio tracks
- Adjusted `per_language` logic to ensure correct audio track selection
- Fixed issue where all tracks for selected language were being downloaded instead of just the intended ones
## [1.4.1] - 2025-08-08
### Added
- **Title Caching System**: Intelligent title caching to reduce redundant API calls
- Configurable title caching with 30-minute default cache duration
- 24-hour fallback cache on API failures for improved reliability
- Region-aware caching to handle geo-restricted content properly
- SHA256 hashing for cache keys to handle complex title IDs
- Added `--no-cache` CLI flag to bypass caching when needed
- Added `--reset-cache` CLI flag to clear existing cache data
- New cache configuration variables in config system
- Documented caching options in example configuration file
- Significantly improves performance when debugging or modifying CLI parameters
- **Enhanced Tagging Configuration**: New options for customizing tag behavior
- Added `tag_group_name` config option to control group name inclusion in tags
- Added `tag_imdb_tmdb` config option to control IMDB/TMDB details in tags
- Added Simkl API endpoint support as fallback when no TMDB API key is provided
- Enhanced tag_file function to prioritize provided TMDB ID when `--tmdb` flag is used
- Improved TMDB ID handling with better prioritization logic
### Changed
- **Language Selection Enhancement**: Improved default language handling
- Updated language option default to 'orig' when no `-l` flag is set
- Avoids hardcoded 'en' default and respects original content language
- **Tagging Logic Improvements**: Simplified and enhanced tagging functionality
- Simplified Simkl search logic with soft-fail when no results found
- Enhanced tag_file function with better TMDB ID prioritization
- Improved error handling in tagging operations
### Fixed
- **Subtitle Processing**: Enhanced subtitle filtering for edge cases
- Fixed ValueError in subtitle filtering for multiple colons in time references
- Improved handling of subtitles containing complex time formatting
- Better error handling for malformed subtitle timestamps
### Removed
- **Docker Support**: Removed Docker configuration from repository
- Removed Dockerfile and .dockerignore files
- Cleaned up README.md Docker-related documentation
- Focuses on direct installation methods
## [1.4.0] - 2025-08-05
### Added
- **HLG Transfer Characteristics Preservation**: Enhanced video muxing to preserve HLG color metadata
- Added automatic detection of HLG video tracks during muxing process
- Implemented `--color-transfer-characteristics 0:18` argument for mkvmerge when processing HLG content
- Prevents incorrect conversion from HLG (18) to BT.2020 (14) transfer characteristics
- Ensures proper HLG playback support on compatible hardware without manual editing
- **Original Language Support**: Enhanced language selection with 'orig' keyword support
- Added support for 'orig' language selector for both video and audio tracks
- Automatically detects and uses the title's original language when 'orig' is specified
- Improved language processing logic with better duplicate handling
- Enhanced help text to document original language selection usage
- **Forced Subtitle Support**: Added option to include forced subtitle tracks
- New functionality to download and include forced subtitle tracks alongside regular subtitles
- **WebVTT Subtitle Filtering**: Enhanced subtitle processing capabilities
- Added filtering for unwanted cues in WebVTT subtitles
- Improved subtitle quality by removing unnecessary metadata
### Changed
- **DRM Track Decryption**: Improved DRM decryption track selection logic
- Enhanced `get_drm_for_cdm()` method usage for better DRM-CDM matching
- Added warning messages when no matching DRM is found for tracks
- Improved error handling and logging for DRM decryption failures
- **Series Tree Representation**: Enhanced episode tree display formatting
- Updated series tree to show season breakdown with episode counts
- Improved visual representation with "S{season}({count})" format
- Better organization of series information in console output
- **Hybrid Processing UI**: Enhanced extraction and conversion processes
- Added dynamic spinning bars to match the rest of the codebase's design
- Improved visual feedback during hybrid HDR processing operations
- **Track Selection Logic**: Enhanced multi-track selection capabilities
- Fixed track selection to support combining -V, -A, -S flags properly
- Improved flexibility in selecting multiple track types simultaneously
- **Service Subtitle Support**: Added configuration for services without subtitle support
- Services can now indicate if they don't support subtitle downloads
- Prevents unnecessary subtitle download attempts for unsupported services
- **Update Checker**: Enhanced update checking logic and cache handling
- Improved rate limiting and caching mechanisms for update checks
- Better performance and reduced API calls to GitHub
### Fixed
- **PlayReady KID Extraction**: Enhanced KID extraction from PSSH data
- Added base64 support and XML parsing for better KID detection
- Fixed issue where only one KID was being extracted for certain services
- Improved multi-KID support for PlayReady protected content
- **Dolby Vision Detection**: Improved DV codec detection across all formats
- Fixed detection of dvhe.05.06 codec which was not being recognized correctly
- Enhanced detection logic in Episode and Movie title classes
- Better support for various Dolby Vision codec variants
## [1.3.0] - 2025-08-03
### Added
- **mp4decrypt Support**: Alternative DRM decryption method using mp4decrypt from Bento4
- Added `mp4decrypt` binary detection and support in binaries module
- New `decryption` configuration option in unshackle.yaml for service-specific decryption methods
- Enhanced PlayReady and Widevine DRM classes with mp4decrypt decryption support
- Service-specific decryption mapping allows choosing between `shaka` and `mp4decrypt` per service
- Improved error handling and progress reporting for mp4decrypt operations
- **Scene Naming Configuration**: New `scene_naming` option for controlling file naming conventions
- Added scene naming logic to movie, episode, and song title classes
- Configurable through unshackle.yaml to enable/disable scene naming standards
- **Terminal Cleanup and Signal Handling**: Enhanced console management
- Implemented proper terminal cleanup on application exit
- Added signal handling for graceful shutdown in ComfyConsole
- **Configuration Template**: New `unshackle-example.yaml` template file
- Replaced main `unshackle.yaml` with example template to prevent git conflicts
- Users can now modify their local config without affecting repository updates
- **Enhanced Credential Management**: Improved CDM and vault configuration
- Expanded credential management documentation in configuration
- Enhanced CDM configuration examples and guidelines
- **Video Transfer Standards**: Added `Unspecified_Image` option to Transfer enum
- Implements ITU-T H.Sup19 standard value 2 for image characteristics
- Supports still image coding systems and unknown transfer characteristics
- **Update Check Rate Limiting**: Enhanced update checking system
- Added configurable update check intervals to prevent excessive API calls
- Improved rate limiting for GitHub API requests
### Changed
- **DRM Decryption Architecture**: Enhanced decryption system with dual method support
- Updated `dl.py` to handle service-specific decryption method selection
- Refactored `Config` class to manage decryption method mapping per service
- Enhanced DRM decrypt methods with `use_mp4decrypt` parameter for method selection
- **Error Handling**: Improved exception handling in Hybrid class
- Replaced log.exit calls with ValueError exceptions for better error propagation
- Enhanced error handling consistency across hybrid processing
### Fixed
- **Proxy Configuration**: Fixed proxy server mapping in configuration
- Renamed 'servers' to 'server_map' in proxy configuration to resolve Nord/Surfshark naming conflicts
- Updated configuration structure for better compatibility with proxy providers
- **HTTP Vault**: Improved URL handling and key retrieval logic
- Fixed URL processing issues in HTTP-based key vaults
- Enhanced key retrieval reliability and error handling
## [1.2.0] - 2025-07-30
### Added
- **Update Checker**: Automatic GitHub release version checking on startup
- Configurable update notifications via `update_checks` setting in unshackle.yaml
- Non-blocking HTTP requests with 5-second timeout for performance
- Smart semantic version comparison supporting all version formats (x.y.z, x.y, x)
- Graceful error handling for network issues and API failures
- User-friendly update notifications with current → latest version display
- Direct links to GitHub releases page for easy updates
- **HDR10+ Support**: Enhanced HDR10+ metadata processing for hybrid tracks
- HDR10+ tool binary support (`hdr10plus_tool`) added to binaries module
- HDR10+ to Dolby Vision conversion capabilities in hybrid processing
- Enhanced metadata extraction for HDR10+ content
- **Duration Fix Handling**: Added duration correction for video and hybrid tracks
- **Temporary Directory Management**: Automatic creation of temp directories for attachment downloads
### Changed
- Enhanced configuration system with new `update_checks` boolean option (defaults to true)
- Updated sample unshackle.yaml with update checker configuration documentation
- Improved console styling consistency using `bright_black` for dimmed text
- **Environment Dependency Check**: Complete overhaul with detailed categorization and status summary
- Organized dependencies by category (Core, HDR, Download, Subtitle, Player, Network)
- Enhanced status reporting with compact summary display
- Improved tool requirement tracking and missing dependency alerts
- **Hybrid Track Processing**: Significant improvements to HDR10+ and Dolby Vision handling
- Enhanced metadata extraction and processing workflows
- Better integration with HDR processing tools
### Removed
- **Docker Workflow**: Removed Docker build and publish GitHub Actions workflow for manual builds
## [1.1.0] - 2025-07-29
### Added
- **HDR10+DV Hybrid Processing**: New `-r HYBRID` command for processing HDR10 and Dolby Vision tracks
- Support for hybrid HDR processing and injection using dovi_tool
- New hybrid track processing module for seamless HDR10/DV conversion
- Automatic detection and handling of HDR10 and DV metadata
- Support for HDR10 and DV tracks in hybrid mode for EXAMPLE service
- Binary availability check for dovi_tool in hybrid mode operations
- Enhanced track processing capabilities for HDR content
### Fixed
- Import order issues and missing json import in hybrid processing
- UV installation process and error handling improvements
- Binary search functionality updated to use `binaries.find`
### Changed
- Updated package version from 1.0.2 to 1.1.0
- Enhanced dl.py command processing for hybrid mode support
- Improved core titles (episode/movie) processing for HDR content
- Extended tracks module with hybrid processing capabilities

CONFIG.md Normal file

@@ -0,0 +1,684 @@
# Config Documentation
This page documents configuration values and what they do. You begin with an empty configuration file.
You may alter your configuration with `unshackle cfg --help`, or find the direct location with `unshackle env info`.
Configuration values are listed in alphabetical order.
Avoid putting comments in the config file as they may be removed. Comments are currently kept only because
`ruamel.yaml` is used to parse and write the YAML files. In the future `yaml` may be used instead,
which does not keep comments.
## aria2c (dict)
- `max_concurrent_downloads`
Maximum number of parallel downloads. Default: `min(32, cpu_count + 4)`
Note: Overrides the `max_workers` parameter of the aria2(c) downloader function.
- `max_connection_per_server`
Maximum number of connections to one server for each download. Default: `1`
- `split`
Split a file into N chunks and download each chunk on its own connection. Default: `5`
- `file_allocation`
Specify file allocation method. Default: `"prealloc"`
- `"none"` doesn't pre-allocate file space.
- `"prealloc"` pre-allocates file space before download begins. This may take some time depending on the size of the
file.
- `"falloc"` is your best choice if you are using newer file systems such as ext4 (with extents support), btrfs, xfs
or NTFS (MinGW build only). It allocates large(few GiB) files almost instantly. Don't use falloc with legacy file
systems such as ext3 and FAT32 because it takes almost same time as prealloc, and it blocks aria2 entirely until
allocation finishes. falloc may not be available if your system doesn't have posix_fallocate(3) function.
- `"trunc"` uses ftruncate(2) system call or platform-specific counterpart to truncate a file to a specified length.
## cdm (dict)
Pre-define which Widevine or PlayReady device to use for each Service by Service Tag as Key (case-sensitive).
The value should be a WVD or PRD filename without the file extension. When
loading the device, unshackle will look in both the `WVDs` and `PRDs` directories
for a matching file.
For example,
```yaml
AMZN: chromecdm_903_l3
NF: nexus_6_l1
```
You may also specify this device based on the profile used.
For example,
```yaml
AMZN: chromecdm_903_l3
NF: nexus_6_l1
DSNP:
  john_sd: chromecdm_903_l3
  jane_uhd: nexus_5_l1
```
You can also specify a fallback value to predefine if a match was not made.
This can be done using the `default` key, which helps reduce redundancy in your specifications.
For example, the following has the same result as the previous example, as well as all other
services and profiles being pre-defined to use `chromecdm_903_l3`.
```yaml
NF: nexus_6_l1
DSNP:
  jane_uhd: nexus_5_l1
default: chromecdm_903_l3
```
## chapter_fallback_name (str)
The Chapter Name to use when exporting a Chapter without a Name.
The default is no fallback name at all and no Chapter name will be set.
The fallback name can use the following variables in f-string style:
- `{i}`: The Chapter number starting at 1.
E.g., `"Chapter {i}"`: "Chapter 1", "Intro", "Chapter 3".
- `{j}`: A number starting at 1 that increments any time a Chapter has no title.
E.g., `"Chapter {j}"`: "Chapter 1", "Intro", "Chapter 2".
These are formatted like f-strings, so format directives are supported.
For example, `"Chapter {i:02}"` will result in `"Chapter 01"`.
## credentials (dict[str, str|list|dict])
Specify login credentials to use for each Service, and optionally per-profile.
For example,
```yaml
ALL4: jane@gmail.com:LoremIpsum100 # directly
AMZN: # or per-profile, optionally with a default
  default: jane@example.tld:LoremIpsum99 # <-- used by default if -p/--profile is not used
  james: james@gmail.com:TheFriend97
  john: john@example.tld:LoremIpsum98
NF: # the `default` key is not necessary, but no credential will be used by default
  john: john@gmail.com:TheGuyWhoPaysForTheNetflix69420
```
The value should be in string form, i.e. `john@gmail.com:password123` or `john:password123`.
Any arbitrary values can be used on the left (username/email/phone) and right (password/secret).
You can also specify these in list form, i.e., `["john@gmail.com", ":PasswordWithAColon"]`.
If you specify multiple credentials with keys like the `AMZN` and `NF` example above, then you should
use a `default` key or no credential will be loaded automatically unless you use `-p/--profile`. You
do not have to use a `default` key at all.
Please be aware that this information is sensitive and to keep it safe. Do not share your config.
## curl_impersonate (dict)
- `browser` - The Browser to impersonate as. A list of available Browsers and Versions are listed here:
<https://github.com/yifeikong/curl_cffi#sessions>
Default: `"chrome124"`
For example,
```yaml
curl_impersonate:
browser: "chrome120"
```
## directories (dict)
Override the default directories used across unshackle.
The directories are set to common values by default.
The following directories are available and may be overridden,
- `commands` - CLI Command Classes.
- `services` - Service Classes.
- `vaults` - Vault Classes.
- `fonts` - Font files (ttf or otf).
- `downloads` - Downloads.
- `temp` - Temporary files or conversions during download.
- `cache` - Expiring data like Authorization tokens, or other misc data.
- `cookies` - Expiring Cookie data.
- `logs` - Logs.
- `wvds` - Widevine Devices.
- `prds` - PlayReady Devices.
- `dcsl` - Device Certificate Status List.
Notes:
- `services` accepts either a single directory or a list of directories to search for service modules.
For example,
```yaml
downloads: "D:/Downloads/unshackle"
temp: "D:/Temp/unshackle"
```
There are directories not listed that cannot be modified as they are crucial to the operation of unshackle.
## dl (dict)
Pre-define default options and switches of the `dl` command.
The values will be ignored if explicitly set in the CLI call.
The Key must be the same value Python click would resolve it to as an argument.
E.g., `@click.option("-r", "--range", "range_", type=...` actually resolves as `range_` variable.
For example to set the default primary language to download to German,
```yaml
lang: de
```
You can also set multiple preferred languages using a list, e.g.,
```yaml
lang:
  - en
  - fr
```
to set how many tracks to download concurrently to 4 and download threads to 16,
```yaml
downloads: 4
workers: 16
```
to set `--bitrate=CVBR` for the AMZN service,
```yaml
lang: de
AMZN:
  bitrate: CVBR
```
or to change the output subtitle format from the default (original format) to WebVTT,
```yaml
sub_format: vtt
```
## downloader (str | dict)
Choose what software to use to download data throughout unshackle where needed.
You may provide a single downloader globally or a mapping of service tags to
downloaders.
Options:
- `requests` (default) - <https://github.com/psf/requests>
- `aria2c` - <https://github.com/aria2/aria2>
- `curl_impersonate` - <https://github.com/yifeikong/curl-impersonate> (via <https://github.com/yifeikong/curl_cffi>)
- `n_m3u8dl_re` - <https://github.com/nilaoda/N_m3u8DL-RE>
Note that aria2c can reach the highest speeds as it utilizes threading and more connections than the other downloaders. However, aria2c can also be one of the more unstable downloaders. It will work one day, then not another day. It also does not support HTTP(S) proxies while the other downloaders do.
Example mapping:
```yaml
downloader:
  NF: requests
  AMZN: n_m3u8dl_re
  DSNP: n_m3u8dl_re
  default: requests
```
The `default` entry is optional. If omitted, `requests` will be used for services not listed.
## decryption (str | dict)
Choose what software to use to decrypt DRM-protected content throughout unshackle where needed.
You may provide a single decryption method globally or a mapping of service tags to
decryption methods.
Options:
- `shaka` (default) - Shaka Packager - <https://github.com/shaka-project/shaka-packager>
- `mp4decrypt` - mp4decrypt from Bento4 - <https://github.com/axiomatic-systems/Bento4>
Note that Shaka Packager is the traditional method and works with most services. mp4decrypt
is an alternative that may work better with certain services that have specific encryption formats.
Example mapping:
```yaml
decryption:
  ATVP: mp4decrypt
  AMZN: shaka
  default: shaka
```
The `default` entry is optional. If omitted, `shaka` will be used for services not listed.
Simple configuration (single method for all services):
```yaml
decryption: mp4decrypt
```
## filenames (dict)
Override the default filenames used across unshackle.
The filenames use various variables that are replaced during runtime.
The following filenames are available and may be overridden:
- `log` - Log filenames. Uses `{name}` and `{time}` variables.
- `config` - Service configuration filenames.
- `root_config` - Root configuration filename.
- `chapters` - Chapter export filenames. Uses `{title}` and `{random}` variables.
- `subtitle` - Subtitle export filenames. Uses `{id}` and `{language}` variables.
For example,
```yaml
filenames:
log: "unshackle_{name}_{time}.log"
config: "config.yaml"
root_config: "unshackle.yaml"
chapters: "Chapters_{title}_{random}.txt"
subtitle: "Subtitle_{id}_{language}.srt"
```
## headers (dict)
Case-Insensitive dictionary of headers that all Services begin their Request Session state with.
All requests will use these unless changed explicitly or implicitly via a Server response.
These should be sane defaults and anything that would only be useful for some Services should not
be put here.
Avoid headers like 'Accept-Encoding' as that would be a compatibility header that Python-requests will
set for you.
I recommend using,
```yaml
Accept-Language: "en-US,en;q=0.8"
User-Agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.75 Safari/537.36"
```
## key_vaults (list\[dict])
Key Vaults store your obtained Content Encryption Keys (CEKs) and Key IDs per-service.
This can help reduce unnecessary License calls even during the first download. This is because a Service may
provide the same Key ID and CEK for both Video and Audio, as well as for multiple resolutions or bitrates.
You can have as many Key Vaults as you would like. It's nice to share Key Vaults or use a unified Vault on
Teams as sharing CEKs immediately can help reduce License calls drastically.
Three types of Vaults are in the Core codebase: API, SQLite, and MySQL. API makes HTTP requests to a RESTful API,
whereas SQLite and MySQL directly connect to an SQLite or MySQL Database.
Note: SQLite and MySQL vaults have to connect directly to the Host/IP; they cannot sit behind a PHP API or the like.
Beware that some Hosting Providers do not let you access the MySQL server outside their intranet and may not be
accessible outside their hosting platform.
Additional behavior:
- `no_push` (bool): Optional per-vault flag. When `true`, the vault will not receive pushed keys (writes) but
will still be queried and can provide keys for lookups. Useful for read-only/backup vaults.
### Using an API Vault
API vaults use a specific HTTP request format; therefore, API or HTTP Key Vault APIs from other projects or services may
not work in unshackle. The API format can be seen in the [API Vault Code](unshackle/vaults/API.py).
```yaml
- type: API
name: "John#0001's Vault" # arbitrary vault name
uri: "https://key-vault.example.com" # api base uri (can also be an IP or IP:Port)
# uri: "127.0.0.1:80/key-vault"
# uri: "https://api.example.com/key-vault"
token: "random secret key" # authorization token
# no_push: true # optional; make this API vault read-only (lookups only)
```
### Using a MySQL Vault
MySQL vaults can be either MySQL or MariaDB servers. I recommend MariaDB.
A MySQL Vault can be on a local or remote network, but I recommend SQLite for local Vaults.
```yaml
- type: MySQL
name: "John#0001's Vault" # arbitrary vault name
host: "127.0.0.1" # host/ip
# port: 3306 # port (defaults to 3306)
database: vault # database used for unshackle
username: jane11
password: Doe123
# no_push: false # optional; defaults to false
```
I recommend giving only a trusted user (or yourself) CREATE permission and then using unshackle to cache at least one CEK
per Service to have it create the tables. If you don't give any user permissions to create tables, you will need to
make tables yourself.
- Use a password on all user accounts.
- Never use the root account with unshackle (even if it's you).
- Do not give multiple users the same username and/or password.
- Only give users access to the database used for unshackle.
- You may give trusted users CREATE permission so unshackle can create tables if needed.
- Other users should only be given SELECT and INSERT permissions.
### Using an SQLite Vault
SQLite Vaults are usually only used for locally stored vaults. This vault may be stored on a mounted Cloud storage
drive, but I recommend using SQLite exclusively as an offline-only vault. Effectively this is your backup vault in
case something happens to your MySQL Vault.
```yaml
- type: SQLite
name: "My Local Vault" # arbitrary vault name
path: "C:/Users/Jane11/Documents/unshackle/data/key_vault.db"
# no_push: true # optional; commonly true for local backup vaults
```
**Note**: You do not need to create the file at the specified path.
SQLite will create a new SQLite database at that path if one does not exist.
Try not to accidentally move the `db` file once created without reflecting the change in the config, or you will end
up with multiple databases.
If you work on a Team, I recommend that every team member has their own SQLite Vault even if you all use a MySQL vault
together.
## muxing (dict)
- `set_title`
Set the container title to `Show SXXEXX Episode Name` or `Movie (Year)`. Default: `true`
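For example,
```yaml
muxing:
  set_title: false
```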
## n_m3u8dl_re (dict)
Configuration for N_m3u8DL-RE downloader. This downloader is particularly useful for HLS streams.
- `thread_count`
Number of threads to use for downloading. Default: Uses the same value as max_workers from the command.
- `ad_keyword`
Keyword to identify and potentially skip advertisement segments. Default: `None`
- `use_proxy`
Whether to use proxy when downloading. Default: `true`
For example,
```yaml
n_m3u8dl_re:
  thread_count: 16
  ad_keyword: "advertisement"
  use_proxy: true
```
## nordvpn (dict)
**Legacy configuration. Use `proxy_providers.nordvpn` instead.**
Set your NordVPN Service credentials with `username` and `password` keys to automate the use of NordVPN as a Proxy
system where required.
You can also specify specific servers to use per-region with the `server_map` key.
Sometimes a specific server works better for a service than others, so hard-coding one for a day or two helps.
For example,
```yaml
nordvpn:
  username: zxqsR7C5CyGwmGb6KSvk8qsZ # example of the login format
  password: wXVHmht22hhRKUEQ32PQVjCZ
  server_map:
    us: 12 # force US server #12 for US proxies
```
The username and password should NOT be your normal NordVPN Account Credentials.
They should be the `Service credentials` which can be found on your Nord Account Dashboard.
Note that `gb` is used instead of `uk` to be more consistent across regional systems.
## proxy_providers (dict)
Enable external proxy provider services. These proxies will be used automatically where needed as defined by the
Service's GEOFENCE class property, but can also be explicitly used with `--proxy`. You can specify which provider
to use by prefixing it with the provider key name, e.g., `--proxy basic:de` or `--proxy nordvpn:de`. Some providers
support specific query formats for selecting a country/server.
### basic (dict[str, str|list])
Define a mapping of country to proxy to use where required.
The keys are region Alpha 2 Country Codes. Alpha 2 Country Codes are `[a-z]{2}` codes, e.g., `us`, `gb`, and `jp`.
Don't get this mixed up with language codes like `en` vs. `gb`, or `ja` vs. `jp`.
Do note that each key's value can be a list of strings, or a string. For example,
```yaml
us:
- "http://john%40email.tld:password123@proxy-us.domain.tld:8080"
- "http://jane%40email.tld:password456@proxy-us.domain2.tld:8080"
de: "https://127.0.0.1:8080"
```
Note that if multiple proxies are defined for a region, then by default one will be randomly chosen.
You can choose a specific one by specifying its number, e.g., `--proxy basic:us2` will choose the
second proxy of the US list.
### nordvpn (dict)
Set your NordVPN Service credentials with `username` and `password` keys to automate the use of NordVPN as a Proxy
system where required.
You can also specify specific servers to use per-region with the `server_map` key.
Sometimes a specific server works better for a service than others, so hard-coding one for a day or two helps.
For example,
```yaml
username: zxqsR7C5CyGwmGb6KSvk8qsZ # example of the login format
password: wXVHmht22hhRKUEQ32PQVjCZ
server_map:
  us: 12 # force US server #12 for US proxies
```
The username and password should NOT be your normal NordVPN Account Credentials.
They should be the `Service credentials` which can be found on your Nord Account Dashboard.
Once set, you can also specifically opt in to use a NordVPN proxy by specifying `--proxy=gb` or such.
You can even set a specific server number this way, e.g., `--proxy=gb2366`.
Note that `gb` is used instead of `uk` to be more consistent across regional systems.
### surfsharkvpn (dict)
Enable Surfshark VPN proxy service using Surfshark Service credentials (not your login password).
You may pin specific server IDs per region using `server_map`.
```yaml
username: your_surfshark_service_username # https://my.surfshark.com/vpn/manual-setup/main/openvpn
password: your_surfshark_service_password # service credentials, not account password
server_map:
  us: 3844 # force US server #3844
  gb: 2697 # force GB server #2697
  au: 4621 # force AU server #4621
```
### hola (dict)
Enable Hola VPN proxy service. This is a simple provider that doesn't require configuration.
For example,
```yaml
proxy_providers:
  hola: {}
```
Note: Hola VPN is automatically enabled when proxy_providers is configured; no additional setup is required.
## remote_cdm (list\[dict])
Use [pywidevine] Serve-compliant Remote CDMs in unshackle as if they were local Widevine device files.
The name of each defined device maps as if it was a local device and should be used like a local device.
For example,
```yaml
- name: chromecdm_903_l3 # name must be unique for each remote CDM
  # the device type, system id and security level must match the values of the device on the API
  # if any of the information is wrong, it will raise an error, if you do not know it ask the API owner
  device_type: CHROME
  system_id: 1234
  security_level: 3
  host: "http://xxxxxxxxxxxxxxxx/the_cdm_endpoint"
  secret: "secret/api key"
  device_name: "remote device to use" # the device name from the API, usually a wvd filename
```
[pywidevine]: https://github.com/rlaphoenix/pywidevine
## scene_naming (bool)
Set scene-style naming for titles. When `true`, scene naming patterns are used (e.g., `Prime.Suspect.S07E01...`);
when `false`, a more human-readable style is used (e.g., `Prime Suspect S07E01 ...`). Default: `true`.
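For example, to prefer the human-readable style,
```yaml
scene_naming: false
```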
## series_year (bool)
Whether to include the series year in series names for episodes and folders. Default: `true`.
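For example,
```yaml
series_year: false
```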
## serve (dict)
Configuration data for pywidevine's serve functionality run through unshackle.
This effectively allows you to run `unshackle serve` to start serving pywidevine Serve-compliant CDMs right from your
local widevine device files.
- `api_secret` - Secret key for REST API authentication. When set, enables the REST API server alongside the CDM serve functionality. This key is required for authenticating API requests.
For example,
```yaml
api_secret: "your-secret-key-here"
users:
secret_key_for_jane: # 32bit hex recommended, case-sensitive
devices: # list of allowed devices for this user
- generic_nexus_4464_l3
username: jane # only for internal logging, users will not see this name
secret_key_for_james:
devices:
- generic_nexus_4464_l3
username: james
secret_key_for_john:
devices:
- generic_nexus_4464_l3
username: john
# devices can be manually specified by path if you don't want to add it to
# unshackle's WVDs directory for whatever reason
# devices:
# - 'C:\Users\john\Devices\test_devices_001.wvd'
```
## services (dict)
Configuration data for each Service. The Service will have the data within this section merged into the `config.yaml`
before provided to the Service class.
Think of this config to be used for more sensitive configuration data, like user or device-specific API keys, IDs,
device attributes, and so on. A `config.yaml` file is typically shared and not meant to be modified, so use this for
any sensitive configuration data.
The Key is the Service Tag, but can take any arbitrary form for its value. It's expected to begin as either a list or
a dictionary.
For example,
```yaml
NOW:
  client:
    auth_scheme: MESSO
    # ... more sensitive data
```
## set_terminal_bg (bool)
Controls whether unshackle should set the terminal background color. Default: `false`
For example,
```yaml
set_terminal_bg: true
```
## tag (str)
Group or Username to postfix to the end of all download filenames following a dash.
For example, `tag: "J0HN"` will have `-J0HN` at the end of all download filenames.
## tag_group_name (bool)
Enable/disable tagging downloads with your group name when `tag` is set. Default: `true`.
## tag_imdb_tmdb (bool)
Enable/disable tagging downloaded files with IMDB/TMDB/TVDB identifiers (when available). Default: `true`.
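For example, combined with `tag`,
```yaml
tag: "J0HN"            # appends -J0HN to download filenames
tag_group_name: true   # include the group name when tagging
tag_imdb_tmdb: false   # omit IMDB/TMDB/TVDB identifiers from tags
```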
## title_cache_enabled (bool)
Enable/disable caching of title metadata to reduce redundant API calls. Default: `true`.
## title_cache_time (int)
Cache duration in seconds for title metadata. Default: `1800` (30 minutes).
## title_cache_max_retention (int)
Maximum retention time in seconds for serving slightly stale cached title metadata when API calls fail.
Default: `86400` (24 hours). Effective retention is `min(title_cache_time + grace, title_cache_max_retention)`.
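For example, the defaults written out explicitly,
```yaml
title_cache_enabled: true
title_cache_time: 1800            # 30 minutes
title_cache_max_retention: 86400  # 24 hours
```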
## tmdb_api_key (str)
API key for The Movie Database (TMDB). This is used for tagging downloaded files with TMDB,
IMDB and TVDB identifiers. Leave empty to disable automatic lookups.
To obtain a TMDB API key:
1. Create an account at <https://www.themoviedb.org/>
2. Go to <https://www.themoviedb.org/settings/api> to register for API access
3. Fill out the API application form with your project details
4. Once approved, you'll receive your API key
For example,
```yaml
tmdb_api_key: cf66bf18956kca5311ada3bebb84eb9a # Not a real key
```
**Note**: Keep your API key secure and do not share it publicly. This key is used by the core/utils/tags.py module to fetch metadata from TMDB for proper file tagging.
## subtitle (dict)
Control subtitle conversion and SDH (hearing-impaired) stripping behavior.
- `conversion_method`: How to convert subtitles between formats. Default: `auto`.
- `auto`: Use subby for WebVTT/SAMI, standard for others.
- `subby`: Always use subby with CommonIssuesFixer.
- `subtitleedit`: Prefer SubtitleEdit when available; otherwise fallback to standard conversion.
- `pycaption`: Use only the pycaption library (no SubtitleEdit, no subby).
- `pysubs2`: Use pysubs2 library (supports SRT, SSA, ASS, WebVTT, TTML, SAMI, MicroDVD, MPL2, TMP formats).
- `sdh_method`: How to strip SDH cues. Default: `auto`.
- `auto`: Try subby for SRT first, then SubtitleEdit, then filter-subs.
- `subby`: Use subby's SDHStripper (SRT only).
- `subtitleedit`: Use SubtitleEdit's RemoveTextForHI when available.
- `filter-subs`: Use the subtitle-filter library.
Example:
```yaml
subtitle:
conversion_method: auto
sdh_method: auto
```
## update_checks (bool)
Check for updates from the GitHub repository on startup. Default: `true`.
## update_check_interval (int)
How often to check for updates, in hours. Default: `24`.
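For example, a sketch of both update options, using the documented defaults:
```yaml
update_checks: true
update_check_interval: 24  # hours between checks
```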
CONTRIBUTING.md Normal file
@ -0,0 +1,45 @@
# Development
This project is managed using [UV](https://github.com/astral-sh/uv), a fast Python package and project manager.
Install the latest version of UV before continuing. Development currently requires Python 3.9+.
## Set up
Starting from zero? Not sure where to begin? Here are the steps for setting up this Python project using UV. Note that
UV installation instructions should be followed from the UV docs: https://docs.astral.sh/uv/getting-started/installation/
1. Clone the Repository:
```shell
git clone https://github.com/unshackle-dl/unshackle
cd unshackle
```
2. Install the Project with UV:
```shell
uv sync
```
This creates a virtual environment and then installs all project dependencies and executables into it. Your system
Python environment is not affected at all.
3. Run commands in the Virtual environment:
```shell
uv run unshackle
```
Note:
- UV automatically manages the virtual environment for you - no need to manually activate it
- You can use `uv run` to prefix any command you wish to run under the Virtual environment
- For example: `uv run unshackle --help` to run the main application
- JetBrains PyCharm and Visual Studio Code will automatically detect the UV-managed virtual environment
- For more information, see: https://docs.astral.sh/uv/concepts/projects/
4. Install Pre-commit tooling to ensure safe and quality commits:
```shell
uv run pre-commit install
```
LICENSE Normal file
@ -0,0 +1,674 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<https://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.
OLD-CHANGELOG.md Normal file
@ -0,0 +1,835 @@
# Changelog
This is Devine's original changelog, kept here for historical reasons.
All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
Versions [3.0.0] and older use a format based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
but versions thereafter use a custom changelog format using [git-cliff](https://git-cliff.org).
## [3.3.3] - 2024-05-07
### Bug Fixes
- *dl*: Automatically convert TTML Subs to WebVTT for MKV support
- *Subtitle*: Correct timestamps when merging fragmented WebVTT
### Changes
- *env*: List all directories as table in info
- *env*: List possible config path locations when not found
- *binaries*: Move all binary definitions to core/binaries file
- *curl-impersonate*: Remove manual fix for curl proxy SSL
- *curl-impersonate*: Update the default browser to chrome124
- *Config*: Move possible config paths out of func to constant
- *utilities*: Remove get_binary_path, use binaries.find instead
- *dl*: Improve readability of download worker errors
- *env*: Shorten paths on Windows with env vars
## [3.3.2] - 2024-04-16
### Bug Fixes
- *Video*: Ensure track is supported in change_color_range()
- *Video*: Optionalise constructor args, add doc-string & checks
- *Audio*: Optionalise constructor args, add doc-string & checks
- *Subtitle*: Optionalise constructor args, add doc-string & checks
- *HLS*: Ensure playlist.stream_info.codecs exists before use
- *HLS*: Ensure playlist.stream_info.resolution exists before use
- *env*: List used config path, otherwise the default path
- *cfg*: Use loaded config path instead of hardcoded default
- *Basic*: Return None not Exception if no proxy configured
### Changes
- *Video*: Do not print "?"/"Unknown" values in str()
- *Audio*: Do not print "?"/"Unknown" values in str()
- *Subtitle*: Do not print "?"/"Unknown" values in str()
- *Audio*: List lang after codec for consistency with other Tracks
- *Video*: Return None if no m3u RANGE, not SDR
- *env*: Use -- to indicate no config found/loaded
### New Contributors
- [retouching](https://github.com/retouching)
## [3.3.1] - 2024-04-05
### Features
- *dl*: Add *new* --workers to set download threads/workers
### Bug Fixes
- *Chapter*: Cast values to int prior to formatting
- *requests*: Fix multithreaded downloads
- *Events*: Dereference subscription store from ephemeral store
### Changes
- *dl*: Change --workers to --downloads
### New Contributors
- [knowhere01](https://github.com/knowhere01)
## [3.3.0] - 2024-04-02
### Features
- Add support for MKV Attachments via Attachment class
- *dl*: Automatically attach fonts used within SSAv4 subs
- *dl*: Try to find SSAv4 fonts in System OS fonts folder
- *Basic*: Allow single string URIs for countries
- *Basic*: Allow proxy selection by index (one-indexed)
- *Events*: Add new global Event Observer API
### Bug Fixes
- *curl-impersonate*: Set Cert-Authority Bundle for HTTPS Proxies
- *Basic*: Make query case-insensitive
- *WVD*: Ensure WVDs dir exists before moving WVD file
- *WVD*: Fix empty path to WVDs folder check
- *WVD*: Move log out of loop to save performance
- *WVD*: Move log with path before Device load
- *WVD*: Add exists/empty checks to WVD folder dumps
- *Basic*: Fix variable typo regression
### Changes
- *Basic*: Improve proxy format checks
- *WVD*: Print error if path to parse doesn't exist
- *WVD*: Separate logs in loop for visual clarity
- *Track*: Move from OnXyz callables to Event observer
## [3.2.0] - 2024-03-25
### Features
- *ClearKey*: Pass session not proxy str in from_m3u_key method
- *Track*: Allow Track to choose downloader to use
- *search*: New Search command, Service method, SearchResult Class
### Bug Fixes
- *dl*: Include chapters when muxing
- *aria2c*: Support aria2(c) 1.37.0 by handling upstream regression
- *MultipleChoice*: Simplify super() call and value types
- *dl*: Add single mux job if there's no video tracks
- *Track*: Compute Track ID from the `this` variable, not `self`
- *DASH/HLS*: Don't merge folders, skip final merge if only 1 segment
- *dl*: Use click.command() instead of click.group()
- *HLS*: Remove save dir even if final merge wasn't needed
- *Track*: Fix order of operation mistake in get_track_name
- *requests*: Set HTTP pool connections/maxsize to max workers
- *Video*: Delete original file after using change_color_range()
- *Video*: Delete original file after using remove_eia_cc()
- *requests*: Manually compute default max_workers, otherwise pool size is None
- *requests*: Block until connection freed if too many connections
- *HLS*: Delete subtitle segments as they are merged
- *HLS*: Delete video/audio segments after FFmpeg merge
### Changes
- *ClearKey*: Only use User-Agent if none set in from_m3u_key
- *Track*: Remove TERRITORY_MAP constant, trim SAR China manually
- *Track*: Default the track name to its lang's script/territory
- *Service*: Go back to the default pool_maxsize in Session
## [3.1.0] - 2024-03-05
### Features
- *cli*: Implement MultipleChoice click param based on Choice param
- *dl*: Skip video lang filter if --v-lang unused & only 1 video lang
- *dl*: Change --vcodec default to None, use any codec
- *dl*: Support multiple -r/--range and mux ranges separately
- *Subtitle*: Convert from fTTML->TTML & fVTT->WebVTT post-download
- *Track*: Make ID optional, Automatically compute one if not provided
- *Track*: Add a name property to use for the Track Name
### Bug Fixes
- *dl*: Have --sub-format default to None to keep original sub format
- *HLS*: Use filtered out segment key info
- *Track*: Don't modify lang when getting name
- *Track*: Don't use fallback values "Zzzz"/"ZZ" for track name
- *version*: The `__version__` variable forgot to be updated
### Changes
- Move dl command's download_track() to Track.download()
- *dl*: Remove unused `get_profiles()` method
- *DASH*: Move data values from track url to track data property
- *DASH*: Change how Video FPS is gotten to remove FutureWarning log
- *Track*: Add type checks, improve typing
- *Track*: Remove swap() method and its uses
- *Track*: Remove unused DRM enum
- *Track*: Rename Descriptor's M3U & MPD to HLS & DASH
- *Track*: Remove unnecessary bool casting
- *Track*: Move the path class instance variable with the rest
- *Track*: Return new path on move(), raise exceptions on errors
- *Track*: Move delete and move methods near start of Class
- *Track*: Rename extra to data, enforce type as dict
### Builds
- Explicitly use marisa-trie==1.1.0 for Python 3.12 wheels
## [3.0.0] - 2024-03-01
### Added
- Support for Python 3.12.
- Audio track's Codec Enum now has [FLAC](https://en.wikipedia.org/wiki/FLAC) defined.
- The Downloader to use can now be set in the config under the [downloader key](CONFIG.md#downloader-str).
- New Multi-Threaded Downloader, `requests`, that makes HTTP(S) calls using [Python-requests](https://requests.readthedocs.io).
- New Multi-Threaded Downloader, `curl_impersonate`, that makes HTTP(S) calls using [Curl-Impersonate](https://github.com/yifeikong/curl-impersonate) via [Curl_CFFI](https://github.com/yifeikong/curl_cffi).
- HLS manifests specifying a Byte range value without starting offsets are now supported.
- HLS segments that use `EXT-X-DISCONTINUITY` are now supported.
- DASH manifests with SegmentBase or only BaseURL are now supported.
- Subtitle tracks from DASH manifests now automatically marked as SDH if `urn:tva:metadata:cs:AudioPurposeCS:2007 = 2`.
- The `--audio-only/--subs-only/--chapters-only` flags can now be used simultaneously. For example, `--subs-only`
with `--chapters-only` will get just Subtitles and Chapters.
- Added `--video-only` flag, which can also still be simultaneously used with the only "only" flags. Using all four
of these flags will have the same effect as not using any of them.
- Added `--no-proxy` flag, disabling all uses of proxies, even if `--proxy` is set.
- Added `--sub-format` option, which sets the wanted output subtitle format, defaulting to SubRip (SRT).
- Added `Subtitle.reverse_rtl()` method to use SubtitleEdit's `/ReverseRtlStartEnd` functionality.
- Added `Subtitle.convert()` method to convert the loaded Subtitle to another format. Note that you cannot convert to
fTTML or fVTT, but you can convert from them. SubtitleEdit will be used in precedence over pycaption if available.
Converting to SubStationAlphav4 requires SubtitleEdit, but you may want to manually alter the Canvas resolution after
the download.
- Added support for SubRip (SRT) format subtitles in `Subtitle.parse()` via pycaption.
- Added `API` Vault Client aiming for a RESTful like API.
- Added `Chapters` Class to hold the new reworked `Chapter` objects, automatically handling stuff like order of the
Chapters, Chapter numbers, loading from a chapter file or string, and saving to a chapter file or string.
- Added new `chapter_fallback_name` config option allowing you to set a Chapter Name Template used when muxing Chapters
into an MKV Container with MKVMerge. Do note, it defaults to no Chapter Fallback Name at all, but MKVMerge will force
`Chapter {i:02}` at least for me on Windows with the program language set to English. You may want to instead use
`Chapter {j:02}` which will do `Chapter 01, Intro, Chapter 02` instead of `Chapter 01, Intro, Chapter 03` (an Intro
is not a Chapter of story, but it is the 2nd Chapter marker, so it's up to you how you want to interpret it).
- Added new `Track.OnSegmentDownloaded` Event, called any time one of the Track's segments were downloaded.
- Added new `Subtitle.OnConverted` Event, called any time that Subtitle is converted.
- Implemented `__add__` method to `Tracks` class, allowing you to add to the first Tracks object. For example, making
it handy to merge HLS video tracks with DASH tracks, `tracks = dash_tracks + hls_tracks.videos`, or for iterating:
`for track in dash.videos + hls.videos: ...`.
- Added new utility `get_free_port()` to get a free local port to use, though it may be taken by the time it's used.
### Changed
- Moved from my forked release of pymp4 (`rlaphoenix-pymp4`) back to the original `pymp4` release as it is
now up-to-date with some of my needed fixes.
- The DASH manifest is now stored in the Track `url` property to be reused by `DASH.download_track()`.
- Encrypted DASH streams are now downloaded in full and then decrypted, instead of downloading and decrypting
each individual segment. Unlike HLS, DASH cannot dynamically switch out the DRM/Protection information.
This brings both CPU and Disk IOPS improvements, as well as fixing rare weird decryption anomalies like broken
or odd timestamps, decryption failures, or broken a/v continuity.
- When a track is being decrypted, it now displays "Decrypting" and afterward "Decrypted" in place of the download
speed.
- When a track finishes downloading, it now displays "Downloaded" in place of the download speed.
- When licensing is needed and fails, the track will display "FAILED" in place of the download speed. The track
download will cancel and all other track downloads will be skipped/cancelled; downloading will end.
- The fancy smart quotes (`“` and `”`) are now stripped from filenames.
- All available services are now listed if you provide an invalid service tag/alias.
- If a WVD file fails to load and looks to be in the older unsupported v1 format, then instructions on migrating to
v2 will be displayed.
- If Shaka-Packager prints an error (i.e., `:ERROR:` log message) it will now raise a `subprocess.CalledProcessError`
exception, even if the process return code is 0.
- The Video classes' Primaries, Transfer, and Matrix classes had changes to their enum names to better represent their
values and uses. See the changed names in the [commit](https://github.com/unshackle-dl/unshackle/commit/c159672181ee3bd07b06612f256fa8590d61795c).
- SubRip (SRT) Subtitles no longer have the `MULTI-LANGUAGE SRT` header forcefully removed. The root cause of the error
was identified and fixed in this release.
- Since `Range.Transfer.SDR_BT_601_625 = 5` has been removed, `Range.from_cicp()` now internally remaps CICP transfer
values of `5` to `6` (which is now `Range.Transfer.BT_601 = 6`).
- Referer and User-Agent Header values passed to the aria2(c) downloader is now set via the dedicated `--referer` and
`--user-agent` options respectively, instead of `--header`.
- The aria2(c) `-j`, `-x`, and `-s` option values can now be set by the config under the `aria2c` key in the options'
full names.
- The aria2(c) `-x`, and `-s` option values now use aria2(c)'s own default values for them instead of `16`. The `j`
option value defaults to ThreadPoolExecutor's algorithm of `min(32,(cpu_count+4))`.
- The download progress bar now states `LICENSING` on the speed text when licensing DRM, and `LICENSED` once finished.
- The download progress bar now states `CANCELLING`/`CANCELLED` on the speed text when cancelling downloads. This is to
make it more clear that it didn't just stop, but stopped as it was cancelled.
- The download cancel/skip events were moved to `constants.py` so it can be used across the codebase easier without
argument drilling. `DL_POOL_STOP` was renamed to `DOWNLOAD_CANCELLED` and `DL_POOL_SKIP` to `DOWNLOAD_LICENCE_ONLY`.
- The Cookie header is now calculated for each URL passed to the aria2(c) downloader based on the URL. Instead of
passing every single cookie, which could have two cookies with the same name aimed for different host names, we now
pass only cookies intended for the URL.
- The aria2(c) process no longer prints output to the terminal directly. unshackle now only prints contents of the
captured log messages to the terminal. This allows filtering out errors and warnings that aren't a problem.
- DASH and HLS no longer download segments silencing errors on all but the last retry as the downloader rework makes
this unnecessary. The errors will only be printed on the final retry regardless.
- `Track.repackage()` now saves as `{name}_repack.{ext}` instead of `{name}.repack.{ext}`.
- `Video.change_color_range()` now saves as `{name}_{limited|full}_range.{ext}` instead of `{name}.range{0|1}.{ext}`.
- `Widevine.decrypt()` now saves as `{name}_decrypted.{ext}` instead of `{name}.decrypted.{ext}`.
- Files starting with the save path's name and using the save path's extension, but not the save path, are no longer
deleted on download finish/stop/failure.
- The output container format is now explicitly specified as `MP4` when calling `shaka-packager`.
- The default downloader is now `requests` instead of `aria2c` to reduce required external dependencies.
- Reworked the `Chapter` class to only hold a timestamp and name value with an ID automatically generated as a CRC32 of
the Chapter representation.
- The `--group` option has been renamed to `--tag`.
- The config file is now read from three more locations in the following order:
1) The unshackle Namespace Folder (e.g., `%appdata%/Python/Python311/site-packages/unshackle/unshackle.yaml`).
2) The Parent Folder to the unshackle Namespace Folder (e.g., `%appdata%/Python/Python311/site-packages/unshackle.yaml`).
3) The AppDirs User Config Folder (e.g., `%localappdata%/unshackle/unshackle.yaml`).
Location 2 allows having a config at the root of a portable folder.
- An empty config file is no longer created when no config file is found.
- You can now set a default cookie file for a Service, [see README](README.md#cookies--credentials).
- You can now set a default credential for a Service, [see config](CONFIG.md#credentials-dictstr-strlistdict).
- Services are now auth-less by default and the error for not having at least a cookie or credential is removed.
Cookies/Credentials will only be loaded if a default one for the service is available, or if you use `-p/--profile`
and the profile exists.
- Subtitles when converting to SubRip (SRT) via SubtitleEdit will now use the `/ConvertColorsToDialog` option.
- HLS segments are now merged by discontinuity instead of all at once. The merged discontinuities are then finally
merged to one file using `ffmpeg`. Doing the final merge by byte concatenation did not work for some playlists.
- The Track is no longer passed through Event Callables. If you are able to set a function on an Event Callable, then
you should have access to the track reference to call it directly if needed.
- The Track.OnDecrypted event callable is now passed the DRM and Segment objects used to Decrypt. The segment object is
only passed from HLS downloads.
- The Track.OnDownloaded event callable is now called BEFORE decryption, right after downloading, not after decryption.
- All generated Track ID values across the codebase have moved from md5 to crc32 values as code processors complain
about its use surrounding security, and its length is too large for our use case anyway.
- HLS segments are now downloaded multi-threaded first and then processed in sequence thereafter.
- HLS segments are no longer decrypted one-by-one, requiring a lot of shaka-packager processes to run and close.
They now merged and decrypt in groups based on their EXT-X-KEY, before being merged per discontinuity.
- The DASH and HLS downloaders now pass multiple URLs to the downloader instead of one-by-one, heavily increasing speed
and reliability as connections are kept alive and re-used.
- Downloaders now yield back progress information in the same convention used by `rich`'s `Progress.update()` method.
DASH and HLS now pass the yielded information to their progress callable instead of passing the progress callable to
the downloader.
- The aria2(c) downloader now uses the aria2(c) JSON-RPC interface to query for download progress updates instead of
parsing the stdout data in an extremely hacky way.
- The aria2(c) downloader now re-routes non-HTTP proxies via `pproxy` by a subprocess instead of the now-removed
`start_pproxy` utility. This way has proven to be easier, more reliable, and prevents pproxy from messing with rich's
terminal output in strange ways.
- All downloader functions have an altered but ultimately similar signature. `uri` became `urls`, `out` (path) was removed;
we now calculate the save path by passing an `output_dir` and `filename`. The `silent`, `segmented`, and `progress`
parameters were completely removed.
- All downloader `urls` can now be a string or a dictionary containing extra URL-specific options to use, like
URL-specific headers. It can also be a list of the two types of URLs to download multi-threaded.
- All downloader `filenames` can be a static string, or a filename string template with a few variables to use. The
template system used is f-string, e.g., `"file_{i:03}{ext}"` (ext starts with `.` if there's an extension).
- DASH now updates the progress bar when merging segments.
- The `Widevine.decrypt()` method now also searches for shaka-packager as just `packager` as it is the default build
name. (#74)
### Removed
- The `unshackle auth` command and sub-commands due to lack of support, risk of data, and general quirks with it.
- Removed `profiles` config, you must now specify which profile you wish to use each time with `-p/--profile`. If you
use a specific profile a lot more than others, you should make it the default.
- The `saldl` downloader has been removed as their binary distribution is whack and development has seemed to stall.
It was only used as an alternative to what was at the time the only downloader, aria2(c), as it did not support any
form of Byte Range, but `saldl` did, which was crucial for resuming extremely large downloads or complex playlists.
However, now we have the requests downloader which does support the Range header.
- The `Track.needs_proxy` property was removed for a few architectural design reasons.
  1) Design-wise it isn't valid to have --proxy (or via config/otherwise) set a proxy, then unpredictably have it
     bypassed or disabled. If I specify `--proxy 127.0.0.1:8080`, I would expect it to use that proxy for all
     communication indefinitely, not switch in and out depending on the track or service.
  2) Following from reason 1, it's also a security problem. The only reason I implemented it in the first place was
     so I could download faster on my home connection. This meant I would authenticate and call APIs under a proxy,
     then suddenly download manifests, segments, etc. under my home connection. A competent service could see that
     as an indicator of bad play and flag you.
  3) Maintaining this setup across the codebase is extremely annoying, especially because of how proxies are set up
     and used by Requests in the Session. There's no way to tell a requests Session to temporarily disable the proxy
     and turn it back on later without getting the proxy from the session (in an annoying way), storing it, removing
     it, making the calls, and then, assuming you're still in the same function, adding it back. If you're not in
     the same function, well, time for some spaghetti code.
- The `Range.Transfer.SDR_BT_601_625 = 5` key and value have been removed as I cannot find any official source to
  verify it as the correct use. Usually a `transfer` value of `5` would be PAL SD material, so it better matches `6`,
  which is (now named) `Range.Transfer.BT_601 = 6`. If you have something specifying transfer=5, just remap it to 6.
- The warning log `There's no ... Audio Tracks, likely part of an invariant playlist, continuing...` has been
  removed. So long as your playlist is expected to have no audio tracks, or the audio is part of the video transport,
  this isn't a problem at all, so constantly logging this warning was pointless.
- The `--min-split-size` argument to the aria2(c) downloader, as it was only used to disable splitting on segmented
  downloads, which the newer downloader system doesn't really need or want. If aria2 decides, based on its other
  settings, to split a segment file, then it likely benefits from doing so.
- The `--remote-time` argument from the aria2(c) downloader, as it may need to make both a GET and a HEAD request to
  get the remote time information, slowing the download down. We don't need this information anyway, as the file
  will likely be repacked with `ffmpeg` or multiplexed with `mkvmerge`, discarding that information.
- DASH and HLS's 5-attempt retry loop as the downloaders will retry for us.
- The `start_pproxy` utility has been removed as all uses of it now call `pproxy` via subprocess instead.
- The `LANGUAGE_MUX_MAP` constant and its usage have been removed as they are no longer necessary as of MKVToolNix v54.
### Fixed
- Uses of `__ALL__` with Class objects have been corrected to `__all__` with string objects, following PEP 8.
- Fixed value of URL passed to `Track.get_key_id()` as it was a tuple rather than the URL string.
- The `--skip-dl` flag now works again after breaking in v[1.3.0].
- Move WVD file to correct location on new installations in the `wvd add` command.
- Cookie data is now passed to downloaders and applied per-URL based on the URI it will be used for, just like a browser.
- Failure to get FPS in DASH when SegmentBase isn't used.
- An error message is now returned if a WVD file fails to load instead of raising an exception.
- Track language information within M3U playlists is now validated with langcodes before use. Some manifests use the
  property for arbitrary data that their apps/players use for their own purposes.
- Attempt to fix non-UTF-8 and mixed-encoding Subtitle downloads by automatically converting to UTF-8. (#43)
  Decoding is attempted in the following order: UTF-8, CP-1252, then finally chardet detection. If it's neither UTF-8
  nor CP-1252 and chardet could not detect the encoding, it is left as-is. Conversion is done per segment if the
  Subtitle is segmented, unless it's in the fVTT or fTTML formats, which are binary.
- Chapter character encoding is now explicitly set to UTF-8 when muxing to an MKV container, as Windows seems to
  default to latin1 or similar, breaking chapter names containing any special characters.
- Subtitles passed through SubtitleEdit now explicitly use UTF-8 character encoding, as it previously defaulted to
  UTF-8 with Byte Order Marks (aka UTF-8-SIG/UTF-8-BOM).
- Subtitles passed through SubtitleEdit now use the same output format as the subtitle being processed instead of SRT.
- Fixed rare infinite loop when the Server hosting the init/header data/segment file responds with a `Content-Length`
header with a value of `0` or smaller.
- Removed empty caption lists/languages when parsing Subtitles with `Subtitle.parse()`. This stops conversions to SRT
  from containing the `MULTI-LANGUAGE SRT` header when there were multiple caption lists but only one of them
  actually contained captions.
- Text-based Subtitle formats now try to automatically convert to UTF-8 when run through `Subtitle.parse()`.
- Text-based Subtitle formats now have `&lrm;` and `&rlm;` HTML entities unescaped post-download, as some rendering
  libraries seem not to decode them for us. SubtitleEdit also has problems with `/ReverseRtlStartEnd` unless they
  are already decoded.
- Fixed two concatenation errors surrounding DASH's BaseURL, sourceURL, and media values that start with or use `../`.
- Fixed the number values in the `Newly added to x/y Vaults` log, which now states `Cached n Key(s) to x/y Vaults`.
- File write handler now flushes after appending a new segment to the final save path or checkpoint file, reducing
memory usage by quite a bit in some scenarios.
### New Contributors
- [Shivelight](https://github.com/Shivelight)
## [2.2.0] - 2023-04-23
### Breaking Changes
Since `-q/--quality` has been reworked to support specifying multiple qualities, the type of this value is
no longer `None|int`. It is now `list[int]`, and the list may be empty; it is never `None`.
Please make sure any Service code that uses `quality` via `ctx.parent.params` reflects this change. You may
need to go from an `if quality: ...` to a `for res in quality: ...`, or similar. You may still use `if quality`
to check whether one or more resolutions were specified, but make sure the code within that branch supports
more than one value in the `quality` list. Note that the list will always be in descending order regardless
of how the user specified them. A short migration sketch follows below.
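
A minimal migration sketch (the hard-coded list stands in for the value received via `ctx.parent.params`):

```python
# `quality` is now always list[int], possibly empty, in descending order.
quality: list[int] = [1080, 720]

if quality:  # still fine as a "one or more resolutions specified" check
    for res in quality:  # but the branch must handle every requested value
        print(f"selecting tracks for {res}p")
else:
    print("no specific resolution requested")
```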
### Added
- Added the ability to specify and download multiple resolutions with `-q/--quality`. E.g., `-q 1080p,720p`.
- Added support for DASH manifests that use SegmentList with range values on the Initialization definition (#47).
- Added a check for `uuid` mp4 boxes containing `tenc` box data when getting the Track's Key ID to improve
chances of finding a Key ID.
### Changed
- The download path is no longer printed after each download, simply because it felt unnecessary. It filled up a
  fair amount of vertical space with information you should already know.
- The logs after a download finishes have been split into two: one after the actual downloading process and one
  after the multiplexing process. The downloading process has its own timer as well, so you can see how long the
  downloads themselves took.
- I've switched (for now) from the official pymp4 to my fork. At the time this change was made, the original
  bearypig pymp4 repo was stagnant and the PyPI releases were old. I forked it, added some fixes by TrueDread, and
  released my own update to PyPI, so it's no longer outdated. This was needed for some mp4 box parsing fixes. Since
  then, the original repo is no longer stagnant and a new release was made on PyPI. However, my repo still has some
  of TrueDread's fixes that are not yet in the original repository nor on PyPI.
### Removed
- Removed the `with_resolution` method in the Tracks class. It has been replaced with `by_resolutions`. The new
  replacement method supports getting all, or only n, tracks per resolution, instead of always returning all tracks
  of a resolution.
- Removed the `select_per_language` method in the Tracks class. It has been replaced with `by_language`. The new
  replacement method supports getting all, or only n, tracks per language, instead of only ever getting one track
  per language. It now defaults to getting all tracks by language. See the migration sketch below.
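
A hedged migration sketch for Service code; the replacement method names come from the notes above, but their exact
signatures are assumptions:

```python
def select_tracks(tracks):
    # was: tracks.with_resolution(1080) -- always returned all matches
    videos = tracks.by_resolutions([1080], per_resolution=1)  # signature assumed
    # was: tracks.select_per_language(["en"]) -- only one track per language;
    # by_language now defaults to all tracks per language
    audios = tracks.by_language(["en"])
    return videos, audios
```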
### Fixed
- Prevented some duplicate Widevine tree logs under specific edge-cases.
- The Subtitle parse method no longer absorbs the syntax error message.
- Replaced all negative size values with 0 on TTML subtitles as a negative value would cause syntax errors.
- Fixed crash during decryption when shaka-packager skips decryption of a segment as it had no actual data and
was just headers.
- Fixed CCExtractor crash in some scenarios by repacking the video stream prior to extraction.
- Fixed rare crash when calculating the download speed of DASH and HLS downloads where a segment finished
  immediately after the previous segment. This seemed to only happen on the very last segment in rare situations.
- Fixed some failures parsing `tenc` mp4 boxes when obtaining the track's Key ID by using my own fork of pymp4
with up-to-date code and further fixes.
- Fixed crashes when parsing some `tenc` mp4 boxes by simply skipping `tenc` boxes that fail to parse. This happens
because some services seem to mix up the data of the `tenc` box with that of another type of box.
- Fixed using invalid `tenc` boxes by skipping ones with a version number greater than 1.
## [2.1.0] - 2023-03-16
### Added
- The Track `get_init_segment` method has been re-written to be more controllable. A specific byte range, URL, and
  maximum size can now be specified. A manually specified URL will override the Track's current URL. The byte range
  will override the fallback value of `0-20000` (where 20000 is the default `maximum_size`). It now also checks
  whether the server supports byte ranges, otherwise it streams the response. It also tries to get the file size
  and uses that instead of `maximum_size`, unless it's bigger than `maximum_size`.
- Added new `get_key_id` method to Track to probe the track for a track-specific Encryption Key ID. This is similar to
Widevine's `from_track` method but ignores all `pssh` boxes and manifest information as the information within those
could be for a wider range of tracks or not for that track at all.
- Added a 5-attempt retry system to DASH and HLS downloads. URL downloads only use aria2(c)'s built-in retry system,
  which has the same number of tries and the same delay between attempts. Any errors emitted when downloading
  segments will not be printed to the console unless they occurred on the last attempt.
- Added a fallback way to obtain language information by taking it from the representation ID value, which may have
  the language code within it. E.g., `audio_en=128000` would be an English audio track at 128 kb/s; we now take the
  `en` from that ID where possible (a sketch follows after this list).
- Added support for 13-char JS-style timestamp values to the Cacher system.
- Improved Forced Subtitle recognition by checking for both `forced-subtitle` and `forced_subtitle` (#43).
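
The representation-ID language fallback could look roughly like this sketch (the splitting logic and the
langcodes-based check are assumptions, not the exact implementation):

```python
import re

from langcodes import tag_is_valid

def language_from_rep_id(rep_id: str) -> str | None:
    """Best-effort language guess from an ID such as 'audio_en=128000'."""
    for part in re.split(r"[_=\-.]", rep_id):
        # only consider 2-3 letter chunks that are valid language tags
        if len(part) in (2, 3) and tag_is_valid(part):
            return part
    return None

print(language_from_rep_id("audio_en=128000"))  # en
```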
### Changed
- The `*` symbol is no longer spaced after the Widevine `KID:KEY` when denoting that it is for this specific PSSH.
This reduces wasted vertical space.
- The "aria2 will resume download if the transfer is restarted" logs that occur when aria2(c) handles the CTRL+C break,
and "If there are any errors, then see the log file" logs are now ignored and no longer logged to the console.
- DASH tracks will no longer prepare and license DRM unless it's just about to download. This is to reduce unnecessary
preparation of DRM if the track had been converted to a URL download.
- For a fix listed below, we now use a fork of https://github.com/globocom/m3u8 that fixes a glaring problem with the
EXT-X-KEY parsing system. See <https://github.com/globocom/m3u8/pull/313>.
- The return code when mkvmerge returns an error is now logged with the error message.
- SubtitleEdit has been silenced when using it for SDH stripping.
### Fixed
- Fixed URL joining and Base URL calculations on DASH manifests that use multiple Base URL values.
- URL downloads will now store the chosen DRM before preparing and licensing with the DRM.
- URL downloads will now prepare and license with the DRM if the Track has pre-existing DRM information. Previously it
would only prepare and license DRM if it did not pre-emptively have DRM information before downloading.
- The `*` symbol that indicates that the KID:KEY is for the track being downloaded now uses the new `get_key_id` method
of the track for a more accurate reading.
- License check now ensures a KEY was returned for the Track rather than for every KID of the Track's PSSH. This
  prevents an issue where the PSSH may have Key IDs for both a 720p and a 1080p track, yet only a KEY for the 720p
  track was returned. It would then have raised an error and stopped the download, even though you were downloading
  the 720p track and not the 1080p track, making the error irrelevant.
- Unnecessary duplicate license calls are now prevented in some scenarios where `--cdm-only` is used.
- Fixed accuracy and speed of preparing and licensing DRM on HLS manifests where multiple EXT-X-KEY definitions
  appear throughout the manifest. Using <https://github.com/globocom/m3u8/pull/313> we can now accurately get a
  list of EXT-X-KEYs mapped to each segment. This is a game changer for HLS manifests that use unique keys for every
  single (or most) segments, as it would otherwise have needed to initialize (and possibly make network requests
  for) hundreds of EXT-X-KEYs, per segment. This caused downloads of HLS manifests that used a unique key per
  segment to slow to a grinding crawl, and they still would not decrypt correctly, as the correct initialized key
  could not be mapped to the correct segment.
- Fixed a regression that incorrectly implemented the OnMultiplex event for Audio and Subtitle tracks, causing them
  to never trigger. It would instead accidentally trigger the last Video track's OnMultiplex event in place of the
  Audio or Subtitle track's event.
- The above fix also repaired the automatic SDH-stripped subtitle. Any automatically created SDH-to-non-SDH subtitle
  from prior downloads would not actually have had its SDH captions stripped; it would instead be a duplicate
  subtitle.
### New Contributors
- [Hollander-1908](https://github.com/Hollander-1908)
## [2.0.1] - 2023-03-07
### Added
- Re-added logging support for shaka-packager on errors and warnings. Do note that INFO logs and the 'Insufficient bits
in bitstream for given AVC profile' warning logs are ignored and never printed.
- Added new exceptions to the Widevine DRM class, `CEKNotFound` and `EmptyLicense`.
- Added support for Byte-ranges on HLS init maps.
### Changed
- Now lists the full 'Episode #' text when listing episode titles without an episode name.
- Subprocess exceptions from a download worker no longer print a traceback; only the return code is logged. This is
  because all subprocess errors during a download are now logged, so the full traceback is no longer necessary.
- Aria2(c) no longer pre-allocates file space if segmented. This is to reduce generally unnecessary upfront I/O usage.
- The Widevine DRM class's `get_content_keys` method now raises the new `CEKNotFound` and `EmptyLicense` exceptions
  instead of `ValueError` exceptions.
- The prepare_drm code now raises exceptions where needed instead of calling `sys.exit(1)`. Callers do not need to
  make any changes; the exception should continue up the call stack and get handled by the `dl` command.
### Fixed
- Fixed regression that broke support for pproxy. Do note that while pproxy has wheels for Python 3.11+, it seems to
  be broken; I recommend using Python 3.10 or older for now. See <https://github.com/qwj/python-proxy/issues/161>.
- Fixed regression; the chosen DRM object is now stored back to the `track.drm` field. Please note that using the
  track's DRM field in Service code is not recommended, but for some services it's simply required.
- Fixed regression since v1.4.0 where the byte-range calculation was off by one on the right side of the range. This
  was a one-indexed vs. zero-indexed problem. Please note that this could have affected the integrity of HLS
  downloads that used EXT-X-BYTERANGE. See the sketch after this list.
- Fixed possible soft-lock in HLS if the queue for the previous segment's key and init data gets stuck in an empty
  state due to an exception in a download thread. E.g., if a thread takes the previous segment's key, throws an
  exception, and never gets the chance to give it back for the next thread.
- The prepare_drm function now handles unexpected exceptions raised in the Service's license method. These would
  otherwise have been absorbed and the download would have soft-locked.
- Prevented a double-licensing call race-condition on HLS tracks by using a threading lock when preparing DRM
information. This is not required in DASH, as it prepares DRM on the main thread, once, not per-segment.
- Fixed printing of aria2(c) logs when redirecting progress information to rich progress bars.
- Explicitly mark DASH and HLS aria2(c) downloads as segmented.
- Fixed listing of episode titles without an episode name.
- Fixed centering of the project URL in the ASCII banner.
- Removed the accidental double-newline after the ASCII banner.
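
For reference, the corrected byte-range math from the v1.4.0 regression fix above is zero-indexed and inclusive on
both ends, roughly:

```python
def http_range(offset: int, length: int) -> str:
    """Map an EXT-X-BYTERANGE '<n>@<o>' (n bytes at offset o) to an HTTP Range.

    HTTP ranges are zero-indexed and inclusive, so the right side must be
    offset + length - 1; using offset + length was the off-by-one bug.
    """
    return f"bytes={offset}-{offset + length - 1}"

print(http_range(0, 1024))  # bytes=0-1023
```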
## [2.0.0] - 2023-03-01
This release brings a huge change to the fundamentals of unshackle's logging, UI, and UX.
### Added
- Add new dependency [rich](https://github.com/Textualize/rich) for advanced color and logging capabilities.
- Set rich console output color scheme to the [Catppuccin Mocha](https://github.com/catppuccin/palette) theme.
- Add full download cancellation support by using CTRL+C. Track downloads will now be marked as STOPPED if you press
CTRL+C to stop the download, or FAILED if any unexpected exception occurs during a download. The track will be marked
as SKIPPED if the download stopped or failed before it got a chance to begin. It will print a download cancelled
message if downloading was stopped, or a download error message if downloading failed. It will print the first
download error traceback with rich before stopping.
- Downloads will now automatically cancel if any track or segment download fails.
- Implement sub-commands `add` and `delete` to the `wvd` command for adding and deleting WVD (Widevine Device) files to
and from the configured WVDs directory (#31).
- Add new config option to disable the forced background color. You may want to disable the purple background if
  your terminal isn't able to apply it correctly, or you prefer to use your own terminal's background color.
- Create `ComfyConsole`, `ComfyLogRenderer`, and `ComfyRichHandler`. These are hacky classes to implement padding to
the left and right of all rich console output. This gives unshackle a comfortable and freeing look-and-feel.
- An ASCII banner is now displayed at the start of software execution with the version number.
- Add rich status output to various parts of the download process. It's also used when checking GEOFENCE within the
  base Service class. I encourage you to follow similar procedures where possible in Service code. This will result
  in cleaner log output, and overall fewer logs once finished.
- Add three rich horizontal rules to separate logs during the download process. The Service used, the Title received
from `get_titles()`, and then the Title being downloaded. This helps identify which logs are part of which process.
- Add new `tree` methods to `Series`, `Movies`, and `Album` classes to list items within the objects with Rich Tree.
This allows for more rich console output when displaying E.g., Seasons and Episodes within a Series, or Songs within
an Album.
- Add new `tree` method to the `Tracks` class to list the tracks received from `get_tracks()` with Rich Tree. Similar
to the change just above, this allows for more rich console output. It has replaced the `Tracks.print()` method.
- Add a rich progress bar to the track multiplexing operation.
- Add a log when a download finishes, how long it took, and where the final muxed file was moved to.
- Add a new track event, `OnMultiplex`. This event runs prior to multiplexing the finalized track data together. Use
  it to run code once a track has finished downloading and all post-download operations have completed.
- Add support for mapping Netflix profiles beginning with `h264` to AVC. E.g., the new -QC profiles.
- Download progress bars now display the download speed. It displays in decimal (^1000) size. E.g., MB/s.
- If a download stops or fails, any residual files that may have been downloaded in an incomplete or complete state
  will now be deleted. Download continuation is not yet supported, and this helps reduce leftover stale files.
### Changed
- The logging base config now has `ComfyRichHandler` as its log handler for automatic rich console output when using
the logging system.
- The standard `traceback` module has been overridden with `rich.traceback` for styled traceback output.
- Only the rich console output is now saved when using `--log`.
- All `tqdm` progress bars have been replaced with rich progress bars. The rich progress bars are now displayed under
each track tree.
- The titles are now only listed if `--list-titles` is used. Otherwise, only a brief explanation of what it received
from `get_titles()` will be returned. E.g., for Series it will list how many seasons and episodes were received.
- Similarly, all available tracks are now only listed if `--list` is used. This is to reduce unnecessary prints and
  to avoid confusion between listings of available tracks and listings of tracks that are going to be downloaded.
- Listing all available tracks with `--list` no longer continues execution. It now stops after the first list. If
  you want to list available tracks for a specific title, use `-w` in combination with `--list`.
- The available tracks are now printed in a rich panel with a header denoting the tracks as such.
- The `Series`, `Movies`, and `Album` classes now have a much more simplified string representation. They now simply
state the overarching content within them. E.g., Series says the title and year of the TV Show.
- The final log when all titles are processed is now a rich log and states how long the entire process took.
- Widevine DRM license information is now printed below the tracks as a rich tree.
- The CCExtractor process, Subtitle Conversion process, and FFmpeg Repacking process were all moved out of the track
download function (and therefore the thread) to be done on the main thread after downloading. This improves download
speed as the threads can close and be freed quicker for the next track to begin.
- The CCExtractor process is now optional and will be skipped if the binary could not be found. An error is still
logged in the cases where it would have run.
- The execution point of the `OnDownloaded` event has been moved to directly run after the stream has been downloaded.
It used to run after all the post-download operations finished like CCExtractor, FFmpeg Repacking, and Subtitle
Conversion.
- The automatic SDH-stripped subtitle track now uses the new `OnMultiplex` event instead of `OnDownloaded`. This is to
account for the previous change as it requires the subtitle to be first converted to SubRip to support SDH-stripping.
- Logs during downloads now appear before the downloading track list. This way it isn't constantly interrupting view of
the progress.
- Now running aria2(c) with a normal subprocess instead of through asyncio. This removes the creation of yet another
  thread, which was unnecessary as these calls would already have been under a non-main thread.
- Moved Widevine DRM licensing calls before the download process for normal URL track downloads.
- Segment Merging code for DASH and HLS downloads have been moved from the `dl` class to the HLS and DASH class.
### Removed
- Remove explicit dependency on `coloredlogs` and `colorama` as they are no longer used by unshackle itself.
- Remove dependency `tqdm` as it was replaced with rich progress bars.
- Remove now-unused logging constants like the custom log formats.
- Remove `Tracks.print()` function as it was replaced with the new `Tracks.tree()` function.
- Remove unnecessary sleep calls at the start of threads. This was believed to help with the download stop event
  check, but that was not the case; it instead added an artificial delay to downloads.
### Fixed
- Fix another crash when using unshackle without a config file. It now creates the directory of the config file before
making a new config file.
- Set the default aria2(c) file-allocation to `prealloc` as stated in the config documentation. It uses `prealloc`
  as the default because `falloc` is generally unsupported in most scenarios, making it a poor default.
- Correct the config documentation with regard to `proxies` now being called `proxy_providers`, and `basic` actually
  being a `dict` of lists, not a `dict` of strings.
## [1.4.0] - 2023-02-25
### Added
- Add support for byte-ranged HLS and DASH segments, i.e., HLS EXT-X-BYTERANGE and DASH SegmentBase. Byte-ranged
  segments are downloaded using python-requests, as aria2(c) does not support byte ranges (see the sketch after
  this list).
- Added support for data URI scheme in ClearKey DRM, including support for the base64 extension.
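
A minimal sketch of fetching one byte-ranged segment with python-requests (the URL and range values are
placeholders, not the actual downloader code):

```python
import requests

def download_segment_range(url: str, start: int, end: int) -> bytes:
    """Download an inclusive byte range using the HTTP Range header."""
    r = requests.get(url, headers={"Range": f"bytes={start}-{end}"}, timeout=30)
    r.raise_for_status()
    if r.status_code != 206:  # 206 Partial Content = server honored the range
        raise ValueError("server ignored the Range header")
    return r.content
```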
### Changed
- Increase the urllib3 connection pool max size from the default 10 to 16 * 2. This is to accommodate up to 16
byte-ranged segment downloads while still giving enough room for a few other connections.
- The urllib3 connection pool now blocks and waits if it's full. This removes the Connection Pool Limit warnings when
downloading more than one byte-ranged segmented track at a time.
- Moved `--log` from the `dl` command to the entry command to allow logging of more than just the download command.
With this change, the logs now include the initial root logs, including the version number.
- Disable the urllib3 InsecureRequestWarnings as these seem to occur when using HTTP+S proxies when connecting to an
HTTPS URL. While not ideal, we can't solve this problem, and the warning logs are quite annoying.
### Removed
- Remove the `byte_range` parameter from the aria2(c) downloader that was added in v1.3.0, as it turns out it
  doesn't actually work. Theoretically it should, but aria2(c) doesn't honor the Range header correctly and fails.
### Fixed
- Fix the JOC check on HLS playlists to check if audio channels are defined first.
- Fix decryption of AES-encrypted segments that are not pre-padded to AES-CBC boundary size (16 bytes).
- Fix the order of segment merging on Linux machines. On Windows, `Path.iterdir()` happens to return entries in
  order; on Linux, or at least on some machines, this was not the case (see the sketch after this list).
- Fix printing of the traceback when a download worker raises an unexpected exception.
- Fix initial creation of the config file if none was created yet.
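
The merge-order fix boils down to sorting explicitly instead of trusting directory enumeration order, roughly:

```python
from pathlib import Path

def merge_segments(segment_dir: Path, output: Path) -> None:
    """Concatenate segment files in deterministic (sorted) name order.

    Path.iterdir() happens to be ordered on Windows/NTFS but is not
    guaranteed to be on Linux, so the sort must be explicit.
    """
    with output.open("wb") as out:
        for segment in sorted(segment_dir.iterdir()):
            out.write(segment.read_bytes())
```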
## [1.3.1] - 2023-02-23
### Fixed
- Fixed a regression where the `track.path` was only updated for `Descriptor.URL` downloads if it had DRM. This caused
downloads of subtitles or DRM-free tracks using the `URL` descriptor to be broken (#33).
- Fixed a regression where `title` and `track` were not passed to the Service's functions for getting Widevine Service
Certificates and Widevine Licenses.
- Corrected the Cookie Path that was logged when adding cookies with `unshackle auth add`.
- The config data now defaults to an empty dictionary when completely empty or non-existent. This fixes a crash when
  using `unshackle auth add` without a config file.
## [1.3.0] - 2023-02-22
### Deprecated
- Support for Python 3.8 has been dropped. Support for Windows 7 ended in January 2020.
- Although Python 3.8 is the last version with support for Windows 7, the decision was made to drop support because
the number of affected users would be low.
- You may be interested in <https://github.com/adang1345/PythonWin7>, which has newer installers with patched support.
### Added
- Segmented HLS and DASH downloads now provide useful progress information using TQDM. Previously, aria2c would print
progress information, but it was not very useful for segmented downloads due to how the information was presented.
- Segmented HLS and DASH downloads are now manually multi-threaded, in a similar way to aria2c's `-j 16`.
- A class function was added to the Widevine DRM class to obtain PSSH and KID information from init data by looking
  for PSSH and TENC boxes. This is an alternative to the `from_track` class function for when you only have the init
  data and not a track object.
- Aria2c now has the ability to silence progress output and provide extra arguments.
### Changed
- The downloading system for HLS and DASH has been completely reworked. It no longer downloads segments, merges them,
and then decrypts. Instead, it now downloads and decrypts each individual segment. It dynamically switches DRM and
Init Data per-segment where needed, fully supporting multiple EXT-X-KEY, EXT-X-MAP, and EXT-X-DISCONTINUITY tags in
HLS. You can now download DRM-encrypted and DRM-free segments from within the same manifest, as well as manifests
with unique DRM per-segment. None of this was possible with the old method of downloading.
- If an HLS manifest or segment uses an EXT-X-KEY with the method NONE, the manifest or segment is assumed to be
  DRM-free. This behavior applies even if the manifest or segment has other EXT-X-KEY methods specified, as that
  would be a mistake in the manifest.
- HLS now uses the proxy when loading AES-128 DRM as ClearKey objects, which is required for some services. It will
only be used if `Track.needs_proxy` is True.
- The Widevine and ClearKey DRM classes' decrypt functions no longer ask for a track. Instead, they ask for an input
  file path to decrypt. The input file is automatically deleted and replaced with the decrypted data.
### Removed
- The AtomicSQL utility was removed because it did not actually make the SQL connections thread-safe. It helped, but
  in an almost backwards and over-thought approach.
### Fixed
- The Cacher expiration check now uses your local datetime timestamp over the UTC timestamp, which seems to have fixed
early or late expiration if you are not at exactly UTC+00:00.
- The cookies file path is now checked to exist if supplied with the `--cookies` argument (#30).
- An error is now logged, and execution will end if none of the DRM for a HLS manifest or segment is supported.
- HLS now only loads AES-128 EXT-X-KEY methods as ClearKey DRM because it currently only supports AES-128.
- AtomicSQL was replaced with connection factory systems using thread-safe storage for SQL connections. All Vault SQL
calls are now fully thread-safe.
## [1.2.0] - 2023-02-13
### Deprecation Warning
- This release marks the end of support for Python 3.8.x.
- Although version 1.0.0 was intended to support Python 3.8.x, PyCharm failed to warn about a specific type annotation
incompatibility. As a result, I was not aware that the support was not properly implemented.
- This release adds full support for Python 3.8.x, but it will be the only release with such support.
### Added
- The `dl` command CLI now includes Bitrate Selection options: `-vb/--vbitrate` and `-ab/--abitrate`.
- The `dl` command CLI now includes an Audio Channels Selection option: `-c/--channels`.
- If a download worker fails abruptly, a full traceback will now be printed.
- The aria2c downloader has a new parameter for downloading a specific byte range.
### Changed
- The usage of `Path.with_stem` with `Path.with_suffix` has been simplified to `Path.with_name`.
- When printing audio track information, the assumption that the audio is `2.0ch` has been removed.
- If audio channels were previously set as an integer value, they are no longer transformed into, e.g., `6ch` and
  now follow the normal behavior of being defined as a float value, e.g., `6.0`.
- Audio channels are now explicitly parsed as float values, so parsing of values such as `16/JOC` (HLS) is no longer
  supported. The HLS manifest parser now assumes the track to be `5.1ch` if the channels value ends in `/JOC` (see
  the sketch after this list).
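
In effect, channels parsing now behaves like this sketch (the helper name is assumed):

```python
def parse_channels(channels: str | int | float) -> float:
    """Normalize an audio channels value to a float, e.g. 6 -> 6.0.

    HLS may report Atmos-style values such as '16/JOC'; per the note above,
    those are assumed to be 5.1ch rather than parsing the '16'.
    """
    if isinstance(channels, str) and channels.upper().endswith("/JOC"):
        return 5.1
    return float(channels)

print(parse_channels("16/JOC"))  # 5.1
print(parse_channels(6))         # 6.0
```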
### Fixed
- Support for Python `>=3.8.6,<3.9.0` has been fixed.
- The final fallback FPS value is now only obtained from the SegmentBase's timescale value if it exists.
- The FutureWarning that occurred when getting Segment URLs from SegmentTemplate DASH manifests has been removed.
- The HLS manifest parser now correctly sets the audio track's `joc` parameter.
- Some Segmented WEBVTT streams may have included the WEBVTT header data when converting to SubRip SRT. This issue has
been fixed by separating the header from any previous caption before conversion.
- The DASH manifest parser now uses the final redirected URL as the manifest URI (#25).
- File move operations from or to different drives now work (e.g., importing a cookie from another drive in
  `auth add`) (#27).
### New Contributors
- [Arias800](https://github.com/Arias800)
- [varyg1001](https://github.com/varyg1001)
## [1.1.0] - 2023-02-07
### Added
- Added utility to change the video range flag between full (PC) and limited (TV).
- Added utility to test decoding of video and audio streams using FFmpeg.
- Added CHANGELOG.md
### Changed
- The services and profiles listed by `auth list` are now sorted alphabetically.
- An explicit error is now logged when adding a Cookie to a Service under a duplicate name.
### Fixed
- Corrected the organization name across the project from `unshackle` to `unshackle-dl` as `unshackle` was taken.
- Fixed startup crash if the config was not yet created or was blank.
- Fixed crash when using the `cfg` command to set a config option on new empty config files.
- Fixed crash when loading key vaults during the `dl` command.
- Fixed crash when using the `auth list` command when you do not have a `Cookies` data directory.
- Fixed crash when adding a Cookie using `auth add` to a Service that has no directory yet.
- Fixed crash when adding a Credential using `auth add` when it's the first ever credential, or first for the Service.
## [1.0.0] - 2023-02-06
Initial public release under the name unshackle.
[3.3.3]: https://github.com/unshackle-dl/unshackle/releases/tag/v3.3.3
[3.3.2]: https://github.com/unshackle-dl/unshackle/releases/tag/v3.3.2
[3.3.1]: https://github.com/unshackle-dl/unshackle/releases/tag/v3.3.1
[3.3.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v3.3.0
[3.2.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v3.2.0
[3.1.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v3.1.0
[3.0.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v3.0.0
[2.2.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v2.2.0
[2.1.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v2.1.0
[2.0.1]: https://github.com/unshackle-dl/unshackle/releases/tag/v2.0.1
[2.0.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v2.0.0
[1.4.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v1.4.0
[1.3.1]: https://github.com/unshackle-dl/unshackle/releases/tag/v1.3.1
[1.3.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v1.3.0
[1.2.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v1.2.0
[1.1.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v1.1.0
[1.0.0]: https://github.com/unshackle-dl/unshackle/releases/tag/v1.0.0

README.md
@ -1,2 +1,109 @@
# unshackle
<p align="center">
<img width="16" height="16" alt="no_encryption" src="https://github.com/user-attachments/assets/6ff88473-0dd2-4bbc-b1ea-c683d5d7a134" /> unshackle
<br/>
<sup><em>Movie, TV, and Music Archival Software</em></sup>
<br/>
<a href="https://discord.gg/mHYyPaCbFK">
<img src="https://img.shields.io/discord/1395571732001325127?label=&logo=discord&logoColor=ffffff&color=7289DA&labelColor=7289DA" alt="Discord">
</a>
</p>
## What is unshackle?
unshackle is a fork of [Devine](https://github.com/devine-dl/devine/), a powerful archival tool for downloading movies, TV shows, and music from streaming services. Built with a focus on modularity and extensibility, it provides a robust framework for content acquisition with support for DRM-protected content.
## Key Features
- 🚀 **Easy Installation** - Simple UV installation
- 🎥 **Multi-Media Support** - Movies, TV episodes, and music
- 🛠️ **Built-in Parsers** - DASH/HLS and ISM manifest support
- 🔒 **DRM Support** - Widevine and PlayReady integration
- 🌈 **HDR10+DV Hybrid** - Hybrid Dolby Vision injection via [dovi_tool](https://github.com/quietvoid/dovi_tool)
- 💾 **Flexible Storage** - Local and remote key vaults
- 👥 **Multi-Profile Auth** - Support for cookies and credentials
- 🤖 **Smart Naming** - Automatic P2P-style filename structure
- ⚙️ **Configurable** - YAML-based configuration
- ❤️ **Open Source** - Fully open-source with community contributions welcome
## Quick Start
### Installation
This installs the latest version directly from the GitHub repository:
```shell
git clone https://github.com/unshackle-dl/unshackle.git
cd unshackle
uv sync
uv run unshackle --help
```
### Install unshackle as a global (per-user) tool
```bash
uv tool install git+https://github.com/unshackle-dl/unshackle.git
# Then run:
uvx unshackle --help # or just `unshackle` once PATH updated
```
> [!NOTE]
> After installation, you may need to add the installation path to your PATH environment variable if prompted.
> **Recommended:** Use `uv run unshackle` instead of direct command execution to ensure proper virtual environment activation.
## Planned Features
- 🖥️ **Web UI Access & Control** - Manage and control unshackle from a modern web interface.
- 🔄 **Sonarr/Radarr Interactivity** - Direct integration for automated personal downloads.
- ⚙️ **Better ISM Support** - Improve on ISM support for multiple services
- 🔉 **ATMOS** - Better Atmos Support/Selection
- 🎵 **Music** - Cleanup Audio Tagging using the [tags.py](unshackle/core/utils/tags.py) for artist/track name etc.
### Basic Usage
```shell
# Check available commands
uv run unshackle --help
# Configure your settings
uv run unshackle cfg --help
# Download content (requires configured services)
uv run unshackle dl SERVICE_NAME CONTENT_ID
```
## Documentation
For comprehensive setup guides, configuration options, and advanced usage:
📖 **[Visit our WIKI](https://github.com/unshackle-dl/unshackle/wiki)**
The WIKI contains detailed information on:
- Service configuration
- DRM configuration
- Advanced features and troubleshooting
For guidance on creating services, see our [WIKI documentation](https://github.com/unshackle-dl/unshackle/wiki).
## End User License Agreement
unshackle and its community pages should be treated with the same kindness as other projects.
Please refrain from spam or from asking questions that infringe upon a Service's End User License Agreement.
1. Do not use unshackle for any purposes of which you do not have the rights to do so.
2. Do not share or request infringing content; this includes Widevine Provision Keys, Content Encryption Keys,
   or Service API Calls or Code.
3. The Core codebase is meant to stay Free and Open-Source while the Service code should be kept private.
4. Do not sell any part of this project, neither alone nor as part of a bundle.
If you paid for this software or received it as part of a bundle following payment, you should demand your money
back immediately.
5. Be kind to one another and do not single anyone out.
## Licensing
This software is licensed under the terms of [GNU General Public License, Version 3.0](LICENSE).
You can find a copy of the license in the LICENSE file in the root folder.

cliff.toml Normal file
@ -0,0 +1,71 @@
# git-cliff ~ default configuration file
# https://git-cliff.org/docs/configuration
[changelog]
header = """
# Changelog\n
All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
Versions [3.0.0] and older use a format based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
but versions thereafter use a custom changelog format using [git-cliff](https://git-cliff.org).\n
"""
body = """
{% if version -%}
## [{{ version | trim_start_matches(pat="v") }}] - {{ timestamp | date(format="%Y-%m-%d") }}
{% else -%}
## [Unreleased]
{% endif -%}
{% for group, commits in commits | group_by(attribute="group") %}
### {{ group | striptags | trim | upper_first }}
{% for commit in commits %}
- {% if commit.scope %}*{{ commit.scope }}*: {% endif %}\
{% if commit.breaking %}[**breaking**] {% endif %}\
{{ commit.message | upper_first }}\
{% endfor %}
{% endfor %}\n
"""
footer = """
{% for release in releases -%}
{% if release.version -%}
{% if release.previous.version -%}
[{{ release.version | trim_start_matches(pat="v") }}]: \
https://github.com/{{ remote.github.owner }}/{{ remote.github.repo }}\
/compare/{{ release.previous.version }}..{{ release.version }}
{% endif -%}
{% else -%}
[unreleased]: https://github.com/{{ remote.github.owner }}/{{ remote.github.repo }}\
/compare/{{ release.previous.version }}..HEAD
{% endif -%}
{% endfor %}
"""
trim = true
postprocessors = [
# { pattern = '<REPO>', replace = "https://github.com/orhun/git-cliff" }, # replace repository URL
]
[git]
conventional_commits = true
filter_unconventional = true
split_commits = false
commit_preprocessors = []
commit_parsers = [
{ message = "^feat", group = "<!-- 0 -->Features" },
{ message = "^fix|revert", group = "<!-- 1 -->Bug Fixes" },
{ message = "^docs", group = "<!-- 2 -->Documentation" },
{ message = "^style", skip = true },
{ message = "^refactor", group = "<!-- 3 -->Changes" },
{ message = "^perf", group = "<!-- 4 -->Performance Improvements" },
{ message = "^test", skip = true },
{ message = "^build", group = "<!-- 5 -->Builds" },
{ message = "^ci", skip = true },
{ message = "^chore", skip = true },
]
protect_breaking_commits = false
filter_commits = false
# tag_pattern = "v[0-9].*"
# skip_tags = ""
# ignore_tags = ""
topo_order = false
sort_commits = "oldest"

install.bat Normal file
@ -0,0 +1,61 @@
@echo off
setlocal EnableExtensions EnableDelayedExpansion
echo.
echo === Unshackle setup (Windows) ===
echo.
where uv >nul 2>&1
if %errorlevel% equ 0 (
echo [OK] uv is already installed.
goto install_deps
)
echo [..] uv not found. Installing...
powershell -NoProfile -ExecutionPolicy Bypass -Command "irm https://astral.sh/uv/install.ps1 | iex"
if %errorlevel% neq 0 (
echo [ERR] Failed to install uv.
echo PowerShell may be blocking scripts. Try:
echo Set-ExecutionPolicy RemoteSigned -Scope CurrentUser
echo or install manually: https://docs.astral.sh/uv/getting-started/installation/
pause
exit /b 1
)
set "UV_BIN="
for %%D in ("%USERPROFILE%\.local\bin" "%LOCALAPPDATA%\Programs\uv\bin" "%USERPROFILE%\.cargo\bin") do (
if exist "%%~fD\uv.exe" set "UV_BIN=%%~fD"
)
if not defined UV_BIN (
echo [WARN] Could not locate uv.exe. You may need to reopen your terminal.
) else (
set "PATH=%UV_BIN%;%PATH%"
)
:: Verify
uv --version >nul 2>&1
if %errorlevel% neq 0 (
echo [ERR] uv still not reachable in this shell. Open a new terminal and re-run this script.
pause
exit /b 1
)
echo [OK] uv installed and reachable.
:install_deps
echo.
uv sync
if %errorlevel% neq 0 (
echo [ERR] Dependency install failed. See errors above.
pause
exit /b 1
)
echo.
echo Installation completed successfully!
echo Try:
echo uv run unshackle --help
echo.
pause
endlocal

pyproject.toml Normal file
@ -0,0 +1,120 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "unshackle"
version = "2.1.0"
description = "Modular Movie, TV, and Music Archival Software."
authors = [{ name = "unshackle team" }]
requires-python = ">=3.10,<3.13"
readme = "README.md"
license = "GPL-3.0-only"
keywords = [
"python",
"downloader",
"drm",
"widevine",
]
classifiers = [
"Development Status :: 4 - Beta",
"Environment :: Console",
"Intended Audience :: End Users/Desktop",
"Natural Language :: English",
"Operating System :: OS Independent",
"Topic :: Multimedia :: Video",
"Topic :: Security :: Cryptography",
]
dependencies = [
"appdirs>=1.4.4,<2",
"Brotli>=1.1.0,<2",
"click>=8.1.8,<9",
"construct>=2.8.8,<3",
"crccheck>=1.3.0,<2",
"fonttools>=4.0.0,<5",
"jsonpickle>=3.0.4,<5",
"langcodes>=3.4.0,<4",
"lxml>=5.2.1,<7",
"pproxy>=2.7.9,<3",
"protobuf>=4.25.3,<7",
"pycaption>=2.2.6,<3",
"pycryptodomex>=3.20.0,<4",
"pyjwt>=2.8.0,<3",
"pymediainfo>=6.1.0,<8",
"pymp4>=1.4.0,<2",
"pymysql>=1.1.0,<2",
"pywidevine[serve]>=1.8.0,<2",
"PyYAML>=6.0.1,<7",
"requests[socks]>=2.32.5,<3",
"rich>=13.7.1,<15",
"rlaphoenix.m3u8>=3.4.0,<4",
"ruamel.yaml>=0.18.6,<0.19",
"sortedcontainers>=2.4.0,<3",
"subtitle-filter>=1.4.9,<2",
"Unidecode>=1.3.8,<2",
"urllib3>=2.2.1,<3",
"chardet>=5.2.0,<6",
"curl-cffi>=0.7.0b4,<0.14",
"pyplayready>=0.6.3,<0.7",
"httpx>=0.28.1,<0.29",
"cryptography>=45.0.0,<47",
"subby",
"aiohttp-swagger3>=0.9.0,<1",
"pysubs2>=1.7.0,<2",
"PyExecJS>=1.5.1,<2",
]
[project.urls]
Homepage = "https://github.com/unshackle-dl/unshackle"
Repository = "https://github.com/unshackle-dl/unshackle"
Issues = "https://github.com/unshackle-dl/unshackle/issues"
Discussions = "https://github.com/unshackle-dl/unshackle/discussions"
Changelog = "https://github.com/unshackle-dl/unshackle/blob/master/CHANGELOG.md"
[project.scripts]
unshackle = "unshackle.core.__main__:main"
[dependency-groups]
dev = [
"pre-commit>=3.7.0,<5",
"mypy>=1.9.0,<2",
"mypy-protobuf>=3.6.0,<4",
"types-protobuf>=4.24.0.20240408,<7",
"types-PyMySQL>=1.1.0.1,<2",
"types-requests>=2.31.0.20240406,<3",
"isort>=5.13.2,<8",
"ruff>=0.3.7,<0.15",
"unshackle",
]
[tool.hatch.build.targets.wheel]
packages = ["unshackle"]
[tool.hatch.build.targets.sdist]
include = [
"CHANGELOG.md",
"README.md",
"LICENSE",
]
[tool.ruff]
force-exclude = true
line-length = 120
[tool.ruff.lint]
select = ["E4", "E7", "E9", "F", "W"]
[tool.isort]
line_length = 118
[tool.mypy]
check_untyped_defs = true
disallow_incomplete_defs = true
disallow_untyped_defs = true
follow_imports = "silent"
ignore_missing_imports = true
no_implicit_optional = true
[tool.uv.sources]
unshackle = { workspace = true }
subby = { git = "https://github.com/vevv/subby.git", rev = "5a925c367ffb3f5e53fd114ae222d3be1fdff35d" }

unshackle/__main__.py Normal file
@ -0,0 +1,4 @@
if __name__ == "__main__":
from unshackle.core.__main__ import main
main()

unshackle/commands/cfg.py Normal file
@ -0,0 +1,90 @@
import ast
import logging
import sys
import click
from ruamel.yaml import YAML
from unshackle.core.config import config, get_config_path
from unshackle.core.constants import context_settings
@click.command(
short_help="Manage configuration values for the program and its services.", context_settings=context_settings
)
@click.argument("key", type=str, required=False)
@click.argument("value", type=str, required=False)
@click.option("--unset", is_flag=True, default=False, help="Unset/remove the configuration value.")
@click.option("--list", "list_", is_flag=True, default=False, help="List all set configuration values.")
@click.pass_context
def cfg(ctx: click.Context, key: str, value: str, unset: bool, list_: bool) -> None:
"""
Manage configuration values for the program and its services.
\b
Known Issues:
- Config changes remove all comments of the changed files, which may hold critical data. (#14)
"""
if not key and not value and not list_:
raise click.UsageError("Nothing to do.", ctx)
if value:
try:
value = ast.literal_eval(value)
except (ValueError, SyntaxError):
pass # probably a str without quotes or similar, assume it's a string value
log = logging.getLogger("cfg")
yaml, data = YAML(), None
yaml.default_flow_style = False
config_path = get_config_path() or config.directories.user_configs / config.filenames.root_config
if config_path.exists():
data = yaml.load(config_path)
if not data:
log.warning("No config file was found or it has no data, yet")
# yaml.load() returns `None` if the input data is blank instead of a usable object
# force a usable object by making one and removing the only item within it
data = yaml.load("""__TEMP__: null""")
del data["__TEMP__"]
if list_:
yaml.dump(data, sys.stdout)
return
key_items = key.split(".")
parent_key = key_items[:-1]
trailing_key = key_items[-1]
is_write = value is not None
is_delete = unset
if is_write and is_delete:
raise click.ClickException("You cannot set a value and use --unset at the same time.")
if not is_write and not is_delete:
data = data.mlget(key_items, default=KeyError)
if data is KeyError:
raise click.ClickException(f"Key '{key}' does not exist in the config.")
yaml.dump(data, sys.stdout)
else:
try:
parent_data = data
if parent_key:
parent_data = data.mlget(parent_key, default=data)
                if parent_data is data:
                    # walk and create any missing parent mappings; use a separate
                    # loop variable so the dotted `key` argument isn't shadowed
                    for part in parent_key:
                        if part not in parent_data:
                            parent_data[part] = {}
                        parent_data = parent_data[part]
if is_write:
parent_data[trailing_key] = value
log.info(f"Set {key} to {repr(value)}")
elif is_delete:
del parent_data[trailing_key]
log.info(f"Unset {key}")
except KeyError:
raise click.ClickException(f"Key '{key}' does not exist in the config.")
config_path.parent.mkdir(parents=True, exist_ok=True)
yaml.dump(data, config_path)

unshackle/commands/dl.py Normal file
File diff suppressed because it is too large

unshackle/commands/env.py Normal file
@ -0,0 +1,237 @@
import logging
import os
import shutil
import sys
from pathlib import Path
from typing import Optional
import click
from rich.padding import Padding
from rich.table import Table
from rich.tree import Tree
from unshackle.core import binaries
from unshackle.core.config import POSSIBLE_CONFIG_PATHS, config, config_path
from unshackle.core.console import console
from unshackle.core.constants import context_settings
from unshackle.core.services import Services
@click.group(short_help="Manage and configure the project environment.", context_settings=context_settings)
def env() -> None:
"""Manage and configure the project environment."""
@env.command()
def check() -> None:
"""Checks environment for the required dependencies."""
# Define all dependencies
all_deps = [
# Core Media Tools
{"name": "FFmpeg", "binary": binaries.FFMPEG, "required": True, "desc": "Media processing", "cat": "Core"},
{"name": "FFprobe", "binary": binaries.FFProbe, "required": True, "desc": "Media analysis", "cat": "Core"},
{"name": "MKVToolNix", "binary": binaries.MKVToolNix, "required": True, "desc": "MKV muxing", "cat": "Core"},
{
"name": "mkvpropedit",
"binary": binaries.Mkvpropedit,
"required": True,
"desc": "MKV metadata",
"cat": "Core",
},
{
"name": "shaka-packager",
"binary": binaries.ShakaPackager,
"required": True,
"desc": "DRM decryption",
"cat": "DRM",
},
{
"name": "mp4decrypt",
"binary": binaries.Mp4decrypt,
"required": False,
"desc": "DRM decryption",
"cat": "DRM",
},
# HDR Processing
{"name": "dovi_tool", "binary": binaries.DoviTool, "required": False, "desc": "Dolby Vision", "cat": "HDR"},
{
"name": "HDR10Plus_tool",
"binary": binaries.HDR10PlusTool,
"required": False,
"desc": "HDR10+ metadata",
"cat": "HDR",
},
# Downloaders
{"name": "aria2c", "binary": binaries.Aria2, "required": False, "desc": "Multi-thread DL", "cat": "Download"},
{
"name": "N_m3u8DL-RE",
"binary": binaries.N_m3u8DL_RE,
"required": False,
"desc": "HLS/DASH/ISM",
"cat": "Download",
},
# Subtitle Tools
{
"name": "SubtitleEdit",
"binary": binaries.SubtitleEdit,
"required": False,
"desc": "Sub conversion",
"cat": "Subtitle",
},
{
"name": "CCExtractor",
"binary": binaries.CCExtractor,
"required": False,
"desc": "CC extraction",
"cat": "Subtitle",
},
# Media Players
{"name": "FFplay", "binary": binaries.FFPlay, "required": False, "desc": "Simple player", "cat": "Player"},
{"name": "MPV", "binary": binaries.MPV, "required": False, "desc": "Advanced player", "cat": "Player"},
# Network Tools
{
"name": "HolaProxy",
"binary": binaries.HolaProxy,
"required": False,
"desc": "Proxy service",
"cat": "Network",
},
{"name": "Caddy", "binary": binaries.Caddy, "required": False, "desc": "Web server", "cat": "Network"},
]
# Track overall status
all_required_installed = True
total_installed = 0
total_required = 0
missing_required = []
# Create a single table
table = Table(
title="Environment Dependencies", title_style="bold", show_header=True, header_style="bold", expand=False
)
table.add_column("Category", style="bold cyan", width=10)
table.add_column("Tool", width=16)
table.add_column("Status", justify="center", width=10)
table.add_column("Req", justify="center", width=4)
table.add_column("Purpose", style="bright_black", width=20)
last_cat = None
for dep in all_deps:
path = dep["binary"]
# Category column (only show when it changes)
category = dep["cat"] if dep["cat"] != last_cat else ""
last_cat = dep["cat"]
# Status
if path:
status = "[green]✓[/green]"
total_installed += 1
else:
status = "[red]✗[/red]"
if dep["required"]:
all_required_installed = False
missing_required.append(dep["name"])
if dep["required"]:
total_required += 1
# Required column (compact)
req = "[red]Y[/red]" if dep["required"] else "[bright_black]-[/bright_black]"
# Add row
table.add_row(category, dep["name"], status, req, dep["desc"])
console.print(Padding(table, (1, 2)))
# Compact summary
summary_parts = [f"[bold]Total:[/bold] {total_installed}/{len(all_deps)}"]
if all_required_installed:
summary_parts.append("[green]All required tools installed ✓[/green]")
else:
summary_parts.append(f"[red]Missing required: {', '.join(missing_required)}[/red]")
console.print(Padding(" ".join(summary_parts), (1, 2)))
@env.command()
def info() -> None:
"""Displays information about the current environment."""
log = logging.getLogger("env")
if config_path:
log.info(f"Config loaded from {config_path}")
else:
tree = Tree("No config file found, you can use any of the following locations:")
for i, path in enumerate(POSSIBLE_CONFIG_PATHS, start=1):
tree.add(f"[repr.number]{i}.[/] [text2]{path.resolve()}[/]")
console.print(Padding(tree, (0, 5)))
table = Table(title="Directories", title_style="bold", expand=True)
table.add_column("Name", no_wrap=True)
table.add_column("Path", no_wrap=False, overflow="fold")
path_vars = {
x: Path(os.getenv(x))
for x in ("TEMP", "APPDATA", "LOCALAPPDATA", "USERPROFILE")
if sys.platform == "win32" and os.getenv(x)
}
for name in sorted(dir(config.directories)):
if name.startswith("__") or name == "app_dirs":
continue
attr_value = getattr(config.directories, name)
# Handle both single Path objects and lists of Path objects
if isinstance(attr_value, list):
# For lists, show each path on a separate line
paths_str = "\n".join(str(path.resolve()) for path in attr_value)
table.add_row(name.title(), paths_str)
else:
# For single Path objects, use the original logic
path = attr_value.resolve()
for var, var_path in path_vars.items():
if path.is_relative_to(var_path):
path = rf"%{var}%\{path.relative_to(var_path)}"
break
table.add_row(name.title(), str(path))
console.print(Padding(table, (1, 5)))
@env.group(name="clear", short_help="Clear an environment directory.", context_settings=context_settings)
def clear() -> None:
"""Clear an environment directory."""
@clear.command()
@click.argument("service", type=str, required=False)
def cache(service: Optional[str]) -> None:
"""Clear the environment cache directory."""
log = logging.getLogger("env")
cache_dir = config.directories.cache
if service:
cache_dir = cache_dir / Services.get_tag(service)
log.info(f"Clearing cache directory: {cache_dir}")
files_count = len(list(cache_dir.glob("**/*")))
if not files_count:
log.info("No files to delete")
else:
log.info(f"Deleting {files_count} files...")
shutil.rmtree(cache_dir)
log.info("Cleared")
@clear.command()
def temp() -> None:
"""Clear the environment temp directory."""
log = logging.getLogger("env")
log.info(f"Clearing temp directory: {config.directories.temp}")
files_count = len(list(config.directories.temp.glob("**/*")))
if not files_count:
log.info("No files to delete")
else:
log.info(f"Deleting {files_count} files...")
shutil.rmtree(config.directories.temp)
log.info("Cleared")

unshackle/commands/kv.py Normal file
@ -0,0 +1,214 @@
import logging
import re
from pathlib import Path
from typing import Optional
import click
from unshackle.core.config import config
from unshackle.core.constants import context_settings
from unshackle.core.services import Services
from unshackle.core.vault import Vault
from unshackle.core.vaults import Vaults
def load_vaults(vault_names: list[str]) -> Vaults:
"""Load and validate vaults by name."""
vaults = Vaults()
for vault_name in vault_names:
vault_config = next((x for x in config.key_vaults if x["name"] == vault_name), None)
if not vault_config:
raise click.ClickException(f"Vault ({vault_name}) is not defined in the config.")
vault_type = vault_config["type"]
vault_args = vault_config.copy()
del vault_args["type"]
if not vaults.load(vault_type, **vault_args):
raise click.ClickException(f"Failed to load vault ({vault_name}).")
return vaults
def process_service_keys(from_vault: Vault, service: str, log: logging.Logger) -> dict[str, str]:
"""Get and validate keys from a vault for a specific service."""
content_keys = list(from_vault.get_keys(service))
bad_keys = {kid: key for kid, key in content_keys if not key or key.count("0") == len(key)}
for kid, key in bad_keys.items():
log.warning(f"Skipping NULL key: {kid}:{key}")
return {kid: key for kid, key in content_keys if kid not in bad_keys}
def copy_service_data(to_vault: Vault, from_vault: Vault, service: str, log: logging.Logger) -> int:
"""Copy data for a single service between vaults."""
content_keys = process_service_keys(from_vault, service, log)
total_count = len(content_keys)
if total_count == 0:
log.info(f"{service}: No keys found in {from_vault}")
return 0
try:
added = to_vault.add_keys(service, content_keys)
except PermissionError:
log.warning(f"{service}: No permission to create table in {to_vault}, skipped")
return 0
existed = total_count - added
if added > 0 and existed > 0:
log.info(f"{service}: {added} added, {existed} skipped ({total_count} total)")
elif added > 0:
log.info(f"{service}: {added} added ({total_count} total)")
else:
log.info(f"{service}: {existed} skipped (all existed)")
return added
@click.group(short_help="Manage and configure Key Vaults.", context_settings=context_settings)
def kv() -> None:
"""Manage and configure Key Vaults."""
@kv.command()
@click.argument("to_vault_name", type=str)
@click.argument("from_vault_names", nargs=-1, type=click.UNPROCESSED)
@click.option("-s", "--service", type=str, default=None, help="Only copy data to and from a specific service.")
def copy(to_vault_name: str, from_vault_names: list[str], service: Optional[str] = None) -> None:
"""
Copy data from multiple Key Vaults into a single Key Vault.
Rows with matching KIDs are skipped unless there's no KEY set.
Existing data is not deleted or altered.
The `to_vault_name` argument is the key vault you wish to copy data to.
It should be the name of a Key Vault defined in the config.
The `from_vault_names` argument is the key vault(s) you wish to take
data from. You may supply multiple key vaults.
"""
if not from_vault_names:
raise click.ClickException("No Vaults were specified to copy data from.")
log = logging.getLogger("kv")
all_vault_names = [to_vault_name] + list(from_vault_names)
vaults = load_vaults(all_vault_names)
to_vault = vaults.vaults[0]
from_vaults = vaults.vaults[1:]
vault_names = ", ".join([v.name for v in from_vaults])
log.info(f"Copying data from {vault_names}{to_vault.name}")
if service:
service = Services.get_tag(service)
log.info(f"Filtering by service: {service}")
total_added = 0
for from_vault in from_vaults:
services_to_copy = [service] if service else from_vault.get_services()
for service_tag in services_to_copy:
added = copy_service_data(to_vault, from_vault, service_tag, log)
total_added += added
if total_added > 0:
log.info(f"Successfully added {total_added} new keys to {to_vault}")
else:
log.info("Copy completed - no new keys to add")
@kv.command()
@click.argument("vaults", nargs=-1, type=click.UNPROCESSED)
@click.option("-s", "--service", type=str, default=None, help="Only sync data to and from a specific service.")
@click.pass_context
def sync(ctx: click.Context, vaults: list[str], service: Optional[str] = None) -> None:
"""
Ensure multiple Key Vaults hold the same keys as each other.
It's essentially a two-way copy between each vault.
To see the precise details of what it's doing between each
provided vault, see the documentation for the `copy` command.
"""
if not len(vaults) > 1:
raise click.ClickException("You must provide more than one Vault to sync.")
ctx.invoke(copy, to_vault_name=vaults[0], from_vault_names=vaults[1:], service=service)
for i in range(1, len(vaults)):
ctx.invoke(copy, to_vault_name=vaults[i], from_vault_names=[vaults[i - 1]], service=service)
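The pass order `sync` ends up performing is easier to see spelled out; a minimal sketch with hypothetical vault names:

```python
# A minimal sketch of the copy passes `sync` performs, with hypothetical vault names.
vaults = ["main", "backup", "offsite"]
passes = [(vaults[0], list(vaults[1:]))]  # first: everything into vaults[0]
passes += [(vaults[i], [vaults[i - 1]]) for i in range(1, len(vaults))]  # then: chain back out
for to_vault, from_vaults in passes:
    print(f"copy {', '.join(from_vaults)} -> {to_vault}")
# copy backup, offsite -> main
# copy main -> backup
# copy backup -> offsite
```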
@kv.command()
@click.argument("file", type=Path)
@click.argument("service", type=str)
@click.argument("vaults", nargs=-1, type=click.UNPROCESSED)
def add(file: Path, service: str, vaults: list[str]) -> None:
"""
Add new Content Keys to Key Vault(s) by service.
File should contain one key per line in the format KID:KEY (HEX:HEX).
Each line should have nothing else within it except for the KID:KEY.
Encoding is presumed to be UTF8.
"""
if not file.exists():
raise click.ClickException(f"File provided ({file}) does not exist.")
if not file.is_file():
raise click.ClickException(f"File provided ({file}) is not a file.")
if not service or not isinstance(service, str):
raise click.ClickException(f"Service provided ({service}) is invalid.")
if len(vaults) < 1:
raise click.ClickException("You must provide at least one Vault.")
log = logging.getLogger("kv")
service = Services.get_tag(service)
vaults_ = load_vaults(list(vaults))
data = file.read_text(encoding="utf8")
kid_keys: dict[str, str] = {}
for line in data.splitlines(keepends=False):
line = line.strip()
match = re.search(r"^(?P<kid>[0-9a-fA-F]{32}):(?P<key>[0-9a-fA-F]{32})$", line)
if not match:
continue
kid = match.group("kid").lower()
key = match.group("key").lower()
kid_keys[kid] = key
total_count = len(kid_keys)
for vault in vaults_:
log.info(f"Adding {total_count} Content Keys to {vault}")
added_count = vault.add_keys(service, kid_keys)
existed_count = total_count - added_count
log.info(f"{vault}: {added_count} newly added, {existed_count} already existed (skipped)")
log.info("Done!")
@kv.command()
@click.argument("vaults", nargs=-1, type=click.UNPROCESSED)
def prepare(vaults: list[str]) -> None:
"""Create Service Tables on Vaults if not yet created."""
log = logging.getLogger("kv")
vaults_ = load_vaults(vaults)
for vault in vaults_:
if hasattr(vault, "has_table") and hasattr(vault, "create_table"):
for service_tag in Services.get_tags():
if vault.has_table(service_tag):
log.info(f"{vault} already has a {service_tag} Table")
else:
try:
vault.create_table(service_tag, commit=True)
log.info(f"{vault}: Created {service_tag} Table")
except PermissionError:
log.error(f"{vault} user has no create table permission, skipping...")
continue
else:
log.info(f"{vault} does not use tables, skipping...")
log.info("Done!")

271
unshackle/commands/prd.py Normal file

@ -0,0 +1,271 @@
import logging
from pathlib import Path
from typing import Optional
import click
import requests
from Crypto.Random import get_random_bytes
from pyplayready import InvalidCertificateChain, OutdatedDevice
from pyplayready.cdm import Cdm
from pyplayready.crypto.ecc_key import ECCKey
from pyplayready.device import Device
from pyplayready.system.bcert import Certificate, CertificateChain
from pyplayready.system.pssh import PSSH
from unshackle.core.config import config
from unshackle.core.constants import context_settings
@click.group(
short_help="Manage creation of PRD (Playready Device) files.",
context_settings=context_settings,
)
def prd() -> None:
"""Manage creation of PRD (Playready Device) files."""
@prd.command()
@click.argument("paths", type=Path, nargs=-1)
@click.option(
"-e",
"--encryption_key",
type=Path,
required=False,
help="Optional Device ECC private encryption key",
)
@click.option(
"-s",
"--signing_key",
type=Path,
required=False,
help="Optional Device ECC private signing key",
)
@click.option("-o", "--output", type=Path, default=None, help="Output Directory")
@click.pass_context
def new(
ctx: click.Context,
paths: tuple[Path, ...],
encryption_key: Optional[Path],
signing_key: Optional[Path],
output: Optional[Path],
) -> None:
"""Create a new .PRD PlayReady Device file.
Accepts either paths to a group key and certificate or a single directory
containing ``zgpriv.dat`` and ``bgroupcert.dat``.
"""
if len(paths) == 1 and paths[0].is_dir():
device_dir = paths[0]
group_key = device_dir / "zgpriv.dat"
group_certificate = device_dir / "bgroupcert.dat"
if not group_key.is_file() or not group_certificate.is_file():
raise click.UsageError("Folder must contain zgpriv.dat and bgroupcert.dat", ctx)
elif len(paths) == 2:
group_key, group_certificate = paths
if not group_key.is_file():
raise click.UsageError("group_key: Not a path to a file, or it doesn't exist.", ctx)
if not group_certificate.is_file():
raise click.UsageError("group_certificate: Not a path to a file, or it doesn't exist.", ctx)
device_dir = None
else:
raise click.UsageError(
"Provide either a folder path or paths to group_key and group_certificate",
ctx,
)
if encryption_key and not encryption_key.is_file():
raise click.UsageError("encryption_key: Not a path to a file, or it doesn't exist.", ctx)
if signing_key and not signing_key.is_file():
raise click.UsageError("signing_key: Not a path to a file, or it doesn't exist.", ctx)
log = logging.getLogger("prd")
encryption_key_obj = ECCKey.load(encryption_key) if encryption_key else ECCKey.generate()
signing_key_obj = ECCKey.load(signing_key) if signing_key else ECCKey.generate()
group_key_obj = ECCKey.load(group_key)
certificate_chain = CertificateChain.load(group_certificate)
if certificate_chain.get(0).get_issuer_key() != group_key_obj.public_bytes():
raise InvalidCertificateChain("Group key does not match this certificate")
new_certificate = Certificate.new_leaf_cert(
cert_id=get_random_bytes(16),
security_level=certificate_chain.get_security_level(),
client_id=get_random_bytes(16),
signing_key=signing_key_obj,
encryption_key=encryption_key_obj,
group_key=group_key_obj,
parent=certificate_chain,
)
certificate_chain.prepend(new_certificate)
certificate_chain.verify()
device = Device(
group_key=group_key_obj.dumps(),
encryption_key=encryption_key_obj.dumps(),
signing_key=signing_key_obj.dumps(),
group_certificate=certificate_chain.dumps(),
)
if output and output.suffix:
if output.suffix.lower() != ".prd":
log.warning(
"Saving PRD with the file extension '%s' but '.prd' is recommended.",
output.suffix,
)
out_path = output
else:
out_dir = output or (device_dir or config.directories.prds)
out_path = out_dir / f"{device.get_name()}.prd"
if out_path.exists():
log.error("A file already exists at the path '%s', cannot overwrite.", out_path)
return
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_bytes(device.dumps())
log.info("Created Playready Device (.prd) file, %s", out_path.name)
log.info(" + Security Level: %s", device.security_level)
log.info(" + Group Key: %s bytes", len(device.group_key.dumps()))
log.info(" + Encryption Key: %s bytes", len(device.encryption_key.dumps()))
log.info(" + Signing Key: %s bytes", len(device.signing_key.dumps()))
log.info(" + Group Certificate: %s bytes", len(device.group_certificate.dumps()))
log.info(" + Saved to: %s", out_path.absolute())
@prd.command(name="reprovision")
@click.argument("prd_path", type=Path)
@click.option(
"-e",
"--encryption_key",
type=Path,
required=False,
help="Optional Device ECC private encryption key",
)
@click.option(
"-s",
"--signing_key",
type=Path,
required=False,
help="Optional Device ECC private signing key",
)
@click.option("-o", "--output", type=Path, default=None, help="Output Path or Directory")
@click.pass_context
def reprovision_device(
ctx: click.Context,
prd_path: Path,
encryption_key: Optional[Path],
signing_key: Optional[Path],
output: Optional[Path] = None,
) -> None:
"""Reprovision a Playready Device (.prd) file."""
if not prd_path.is_file():
raise click.UsageError("prd_path: Not a path to a file, or it doesn't exist.", ctx)
log = logging.getLogger("prd")
log.info("Reprovisioning Playready Device (.prd) file, %s", prd_path.name)
device = Device.load(prd_path)
if device.group_key is None:
raise OutdatedDevice(
"Device does not support reprovisioning, re-create it or use a Device with a version of 3 or higher"
)
device.group_certificate.remove(0)
encryption_key_obj = ECCKey.load(encryption_key) if encryption_key else ECCKey.generate()
signing_key_obj = ECCKey.load(signing_key) if signing_key else ECCKey.generate()
device.encryption_key = encryption_key_obj
device.signing_key = signing_key_obj
new_certificate = Certificate.new_leaf_cert(
cert_id=get_random_bytes(16),
security_level=device.group_certificate.get_security_level(),
client_id=get_random_bytes(16),
signing_key=signing_key_obj,
encryption_key=encryption_key_obj,
group_key=device.group_key,
parent=device.group_certificate,
)
device.group_certificate.prepend(new_certificate)
if output and output.suffix:
if output.suffix.lower() != ".prd":
log.warning(
"Saving PRD with the file extension '%s' but '.prd' is recommended.",
output.suffix,
)
out_path = output
else:
out_path = prd_path
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_bytes(device.dumps())
log.info("Reprovisioned Playready Device (.prd) file, %s", out_path.name)
@prd.command()
@click.argument("device", type=Path)
@click.option(
"-c",
"--ckt",
type=click.Choice(["aesctr", "aescbc"], case_sensitive=False),
default="aesctr",
help="Content Key Encryption Type",
)
@click.option(
"-sl",
"--security-level",
type=click.Choice(["150", "2000", "3000"], case_sensitive=False),
default="2000",
help="Minimum Security Level",
)
@click.pass_context
def test(
ctx: click.Context,
device: Path,
ckt: str,
security_level: str,
) -> None:
"""Test a Playready Device on the Microsoft demo server."""
if not device.is_file():
raise click.UsageError("device: Not a path to a file, or it doesn't exist.", ctx)
log = logging.getLogger("prd")
prd_device = Device.load(device)
log.info("Loaded Device: %s", prd_device.get_name())
cdm = Cdm.from_device(prd_device)
log.info("Loaded CDM")
session_id = cdm.open()
log.info("Opened Session")
pssh_b64 = "AAADfHBzc2gAAAAAmgTweZhAQoarkuZb4IhflQAAA1xcAwAAAQABAFIDPABXAFIATQBIAEUAQQBEAEUAUgAgAHgAbQBsAG4AcwA9ACIAaAB0AHQAcAA6AC8ALwBzAGMAaABlAG0AYQBzAC4AbQBpAGMAcgBvAHMAbwBmAHQALgBjAG8AbQAvAEQAUgBNAC8AMgAwADAANwAvADAAMwAvAFAAbABhAHkAUgBlAGEAZAB5AEgAZQBhAGQAZQByACIAIAB2AGUAcgBzAGkAbwBuAD0AIgA0AC4AMAAuADAALgAwACIAPgA8AEQAQQBUAEEAPgA8AFAAUgBPAFQARQBDAFQASQBOAEYATwA+ADwASwBFAFkATABFAE4APgAxADYAPAAvAEsARQBZAEwARQBOAD4APABBAEwARwBJAEQAPgBBAEUAUwBDAFQAUgA8AC8AQQBMAEcASQBEAD4APAAvAFAAUgBPAFQARQBDAFQASQBOAEYATwA+ADwASwBJAEQAPgA0AFIAcABsAGIAKwBUAGIATgBFAFMAOAB0AEcAawBOAEYAVwBUAEUASABBAD0APQA8AC8ASwBJAEQAPgA8AEMASABFAEMASwBTAFUATQA+AEsATABqADMAUQB6AFEAUAAvAE4AQQA9ADwALwBDAEgARQBDAEsAUwBVAE0APgA8AEwAQQBfAFUAUgBMAD4AaAB0AHQAcABzADoALwAvAHAAcgBvAGYAZgBpAGMAaQBhAGwAcwBpAHQAZQAuAGsAZQB5AGQAZQBsAGkAdgBlAHIAeQAuAG0AZQBkAGkAYQBzAGUAcgB2AGkAYwBlAHMALgB3AGkAbgBkAG8AdwBzAC4AbgBlAHQALwBQAGwAYQB5AFIAZQBhAGQAeQAvADwALwBMAEEAXwBVAFIATAA+ADwAQwBVAFMAVABPAE0AQQBUAFQAUgBJAEIAVQBUAEUAUwA+ADwASQBJAFMAXwBEAFIATQBfAFYARQBSAFMASQBPAE4APgA4AC4AMQAuADIAMwAwADQALgAzADEAPAAvAEkASQBTAF8ARABSAE0AXwBWAEUAUgBTAEkATwBOAD4APAAvAEMAVQBTAFQATwBNAEEAVABUAFIASQBCAFUAVABFAFMAPgA8AC8ARABBAFQAQQA+ADwALwBXAFIATQBIAEUAQQBEAEUAUgA+AA=="
pssh = PSSH(pssh_b64)
challenge = cdm.get_license_challenge(session_id, pssh.wrm_headers[0])
log.info("Created License Request")
license_server = f"https://test.playready.microsoft.com/service/rightsmanager.asmx?cfg=(persist:false,sl:{security_level},ckt:{ckt})"
response = requests.post(
url=license_server,
headers={"Content-Type": "text/xml; charset=UTF-8"},
data=challenge,
)
cdm.parse_license(session_id, response.text)
log.info("License Parsed Successfully")
for key in cdm.get_keys(session_id):
log.info(f"{key.key_id.hex}:{key.key.hex()}")
cdm.close(session_id)
log.info("Closed Session")

151
unshackle/commands/search.py Normal file

@ -0,0 +1,151 @@
from __future__ import annotations
import logging
import re
import sys
from typing import Any, Optional
import click
import yaml
from rich.padding import Padding
from rich.rule import Rule
from rich.tree import Tree
from unshackle.commands.dl import dl
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import context_settings
from unshackle.core.proxies import Basic, Hola, NordVPN, SurfsharkVPN
from unshackle.core.service import Service
from unshackle.core.services import Services
from unshackle.core.utils.click_types import ContextData
from unshackle.core.utils.collections import merge_dict
@click.command(
short_help="Search for titles from a Service.",
cls=Services,
context_settings=dict(**context_settings, token_normalize_func=Services.get_tag),
)
@click.option(
"-p", "--profile", type=str, default=None, help="Profile to use for Credentials and Cookies (if available)."
)
@click.option(
"--proxy",
type=str,
default=None,
help="Proxy URI to use. If a 2-letter country is provided, it will try get a proxy from the config.",
)
@click.option("--no-proxy", is_flag=True, default=False, help="Force disable all proxy use.")
@click.pass_context
def search(ctx: click.Context, no_proxy: bool, profile: Optional[str] = None, proxy: Optional[str] = None):
if not ctx.invoked_subcommand:
raise ValueError("A subcommand to invoke was not specified, the main code cannot continue.")
log = logging.getLogger("search")
service = Services.get_tag(ctx.invoked_subcommand)
if profile:
log.info(f"Using profile: '{profile}'")
with console.status("Loading Service Config...", spinner="dots"):
service_config_path = Services.get_path(service) / config.filenames.config
if service_config_path.exists():
service_config = yaml.safe_load(service_config_path.read_text(encoding="utf8"))
log.info("Service Config loaded")
else:
service_config = {}
merge_dict(config.services.get(service), service_config)
proxy_providers = []
if no_proxy:
ctx.params["proxy"] = None
else:
with console.status("Loading Proxy Providers...", spinner="dots"):
if config.proxy_providers.get("basic"):
proxy_providers.append(Basic(**config.proxy_providers["basic"]))
if config.proxy_providers.get("nordvpn"):
proxy_providers.append(NordVPN(**config.proxy_providers["nordvpn"]))
if config.proxy_providers.get("surfsharkvpn"):
proxy_providers.append(SurfsharkVPN(**config.proxy_providers["surfsharkvpn"]))
if binaries.HolaProxy:
proxy_providers.append(Hola())
for proxy_provider in proxy_providers:
log.info(f"Loaded {proxy_provider.__class__.__name__}: {proxy_provider}")
if proxy:
requested_provider = None
if re.match(r"^[a-z]+:.+$", proxy, re.IGNORECASE):
# requesting proxy from a specific proxy provider
requested_provider, proxy = proxy.split(":", maxsplit=1)
if re.match(r"^[a-z]{2}(?:\d+)?$", proxy, re.IGNORECASE):
proxy = proxy.lower()
with console.status(f"Getting a Proxy to {proxy}...", spinner="dots"):
if requested_provider:
proxy_provider = next(
(x for x in proxy_providers if x.__class__.__name__.lower() == requested_provider), None
)
if not proxy_provider:
log.error(f"The proxy provider '{requested_provider}' was not recognised.")
sys.exit(1)
proxy_uri = proxy_provider.get_proxy(proxy)
if not proxy_uri:
log.error(f"The proxy provider {requested_provider} had no proxy for {proxy}")
sys.exit(1)
proxy = ctx.params["proxy"] = proxy_uri
log.info(f"Using {proxy_provider.__class__.__name__} Proxy: {proxy}")
else:
for proxy_provider in proxy_providers:
proxy_uri = proxy_provider.get_proxy(proxy)
if proxy_uri:
proxy = ctx.params["proxy"] = proxy_uri
log.info(f"Using {proxy_provider.__class__.__name__} Proxy: {proxy}")
break
else:
log.info(f"Using explicit Proxy: {proxy}")
ctx.obj = ContextData(config=service_config, cdm=None, proxy_providers=proxy_providers, profile=profile)
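A minimal sketch of how the --proxy value is interpreted above; the inputs are hypothetical and the real provider lookup is skipped:

```python
import re

# Hypothetical --proxy inputs mirroring the parsing above.
for proxy in ("nordvpn:de", "us2"):
    requested_provider = None
    if re.match(r"^[a-z]+:.+$", proxy, re.IGNORECASE):
        requested_provider, proxy = proxy.split(":", maxsplit=1)  # e.g. provider "nordvpn"
    if re.match(r"^[a-z]{2}(?:\d+)?$", proxy, re.IGNORECASE):
        print(requested_provider or "first provider with a match", "->", proxy.lower())
```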
@search.result_callback()
def result(service: Service, profile: Optional[str] = None, **_: Any) -> None:
log = logging.getLogger("search")
service_tag = service.__class__.__name__
with console.status("Authenticating with Service...", spinner="dots"):
cookies = dl.get_cookie_jar(service_tag, profile)
credential = dl.get_credentials(service_tag, profile)
service.authenticate(cookies, credential)
if cookies or credential:
log.info("Authenticated with Service")
search_results = Tree("Search Results", hide_root=True)
with console.status("Searching...", spinner="dots"):
for result in service.search():
result_text = f"[bold text]{result.title}[/]"
if result.url:
result_text = f"[link={result.url}]{result_text}[/link]"
if result.label:
result_text += f" [pink]{result.label}[/]"
if result.description:
result_text += f"\n[text2]{result.description}[/]"
result_text += f"\n[bright_black]id: {result.id}[/]"
search_results.add(result_text + "\n")
# update cookies
cookie_file = dl.get_cookie_path(service_tag, profile)
if cookie_file:
dl.save_cookies(cookie_file, service.session.cookies)
console.print(Padding(Rule(f"[rule.text]{len(search_results.children)} Search Results"), (1, 2)))
if search_results.children:
console.print(Padding(search_results, (0, 5)))
else:
console.print(
Padding("[bold text]No matches[/]\n[bright_black]Please check spelling and search again....[/]", (0, 5))
)

124
unshackle/commands/serve.py Normal file

@ -0,0 +1,124 @@
import logging
import subprocess
import click
from aiohttp import web
from unshackle.core import binaries
from unshackle.core.api import cors_middleware, setup_routes, setup_swagger
from unshackle.core.config import config
from unshackle.core.constants import context_settings
@click.command(
short_help="Serve your Local Widevine Devices and REST API for Remote Access.", context_settings=context_settings
)
@click.option("-h", "--host", type=str, default="0.0.0.0", help="Host to serve from.")
@click.option("-p", "--port", type=int, default=8786, help="Port to serve from.")
@click.option("--caddy", is_flag=True, default=False, help="Also serve with Caddy.")
@click.option("--api-only", is_flag=True, default=False, help="Serve only the REST API, not pywidevine CDM.")
@click.option("--no-key", is_flag=True, default=False, help="Disable API key authentication (allows all requests).")
@click.option(
"--debug-api",
is_flag=True,
default=False,
help="Include technical debug information (tracebacks, stderr) in API error responses.",
)
def serve(host: str, port: int, caddy: bool, api_only: bool, no_key: bool, debug_api: bool) -> None:
"""
Serve your Local Widevine Devices and REST API for Remote Access.
\b
Host as 127.0.0.1 may block remote access even if port-forwarded.
Instead, use 0.0.0.0 and ensure the TCP port you choose is forwarded.
\b
You may serve with Caddy at the same time with --caddy. You can use Caddy
as a reverse-proxy to serve with HTTPS. The config used will be the Caddyfile
next to the unshackle config.
\b
The REST API provides programmatic access to unshackle functionality.
Configure authentication in your config under serve.users and serve.api_secret.
"""
from pywidevine import serve as pywidevine_serve
log = logging.getLogger("serve")
# Validate API secret for REST API routes (unless --no-key is used)
if not no_key:
api_secret = config.serve.get("api_secret")
if not api_secret:
raise click.ClickException(
"API secret key is not configured. Please add 'api_secret' to the 'serve' section in your config."
)
else:
api_secret = None
log.warning("Running with --no-key: Authentication is DISABLED for all API endpoints!")
if debug_api:
log.warning("Running with --debug-api: Error responses will include technical debug information!")
if caddy:
if not binaries.Caddy:
raise click.ClickException('Caddy executable "caddy" not found but is required for --caddy.')
caddy_p = subprocess.Popen(
[binaries.Caddy, "run", "--config", str(config.directories.user_configs / "Caddyfile")]
)
else:
caddy_p = None
try:
if not config.serve.get("devices"):
config.serve["devices"] = []
config.serve["devices"].extend(list(config.directories.wvds.glob("*.wvd")))
if api_only:
# API-only mode: serve just the REST API
log.info("Starting REST API server (pywidevine CDM disabled)")
if no_key:
app = web.Application(middlewares=[cors_middleware])
app["config"] = {"users": []}
else:
app = web.Application(middlewares=[cors_middleware, pywidevine_serve.authentication])
app["config"] = {"users": [api_secret]}
app["debug_api"] = debug_api
setup_routes(app)
setup_swagger(app)
log.info(f"REST API endpoints available at http://{host}:{port}/api/")
log.info(f"Swagger UI available at http://{host}:{port}/api/docs/")
log.info("(Press CTRL+C to quit)")
web.run_app(app, host=host, port=port, print=None)
else:
# Integrated mode: serve both pywidevine + REST API
log.info("Starting integrated server (pywidevine CDM + REST API)")
# Create integrated app with both pywidevine and API routes
if no_key:
app = web.Application(middlewares=[cors_middleware])
app["config"] = dict(config.serve)
app["config"]["users"] = []
else:
app = web.Application(middlewares=[cors_middleware, pywidevine_serve.authentication])
# Setup config - add API secret to users for authentication
serve_config = dict(config.serve)
if not serve_config.get("users"):
serve_config["users"] = []
if api_secret not in serve_config["users"]:
serve_config["users"].append(api_secret)
app["config"] = serve_config
app.on_startup.append(pywidevine_serve._startup)
app.on_cleanup.append(pywidevine_serve._cleanup)
app.add_routes(pywidevine_serve.routes)
app["debug_api"] = debug_api
setup_routes(app)
setup_swagger(app)
log.info(f"REST API endpoints available at http://{host}:{port}/api/")
log.info(f"Swagger UI available at http://{host}:{port}/api/docs/")
log.info("(Press CTRL+C to quit)")
web.run_app(app, host=host, port=port, print=None)
finally:
if caddy_p:
caddy_p.kill()
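A hedged sketch of a client probing the served endpoints; the secret header name follows pywidevine's serve authentication middleware (which this server reuses), and the host, port, and secret are placeholder assumptions:

```python
import requests

# Placeholder host, port, and api_secret value; header name assumed from
# pywidevine's authentication middleware ("X-Secret-Key").
resp = requests.get(
    "http://127.0.0.1:8786/api/docs/",
    headers={"X-Secret-Key": "your-api-secret"},
)
print(resp.status_code)
```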

267
unshackle/commands/util.py Normal file

@ -0,0 +1,267 @@
import subprocess
from pathlib import Path
import click
from pymediainfo import MediaInfo
from unshackle.core import binaries
from unshackle.core.constants import context_settings
@click.group(short_help="Various helper scripts and programs.", context_settings=context_settings)
def util() -> None:
"""Various helper scripts and programs."""
@util.command()
@click.argument("path", type=Path)
@click.argument("aspect", type=str)
@click.option(
"--letter/--pillar",
default=True,
help="Specify which direction to crop. Top and Bottom would be --letter, Sides would be --pillar.",
)
@click.option("-o", "--offset", type=int, default=0, help="Fine tune the computed crop area if not perfectly centered.")
@click.option(
"-p",
"--preview",
is_flag=True,
default=False,
help="Instantly preview the newly-set aspect crop in MPV (or ffplay if mpv is unavailable).",
)
def crop(path: Path, aspect: str, letter: bool, offset: int, preview: bool) -> None:
"""
Losslessly crop H.264 and H.265 video files at the bit-stream level.
You may provide a path to a file, or a folder of mkv and/or mp4 files.
Note: If you notice that the values you put in are not quite working, try
tuning -o/--offset. This may be necessary on videos with sub-sampled chroma.
Do note that you may not get an ideal lossless cropping result in some
cases, again due to sub-sampled chroma.
It's recommended that you start -o at about 10 pixels and lower it until
you get as close in as possible. Do make sure it's not over-cropping either,
as it may go from being 2px away from a perfect crop to 20px over-cropped,
again due to sub-sampled chroma.
"""
if not binaries.FFMPEG:
raise click.ClickException('FFmpeg executable "ffmpeg" not found but is required.')
if path.is_dir():
paths = list(path.glob("*.mkv")) + list(path.glob("*.mp4"))
else:
paths = [path]
for video_path in paths:
try:
video_track = next(iter(MediaInfo.parse(video_path).video_tracks or []))
except StopIteration:
raise click.ClickException("There's no video tracks in the provided file.")
crop_filter = {"HEVC": "hevc_metadata", "AVC": "h264_metadata"}.get(video_track.commercial_name)
if not crop_filter:
raise click.ClickException(f"{video_track.commercial_name} Codec not supported.")
aspect_w, aspect_h = list(map(float, aspect.split(":")))
if letter:
crop_value = (video_track.height - (video_track.width / (aspect_w / aspect_h))) / 2
left, top, right, bottom = map(int, [0, crop_value + offset, 0, crop_value - offset])
else:
crop_value = (video_track.width - (video_track.height * (aspect_w / aspect_h))) / 2
left, top, right, bottom = map(int, [crop_value + offset, 0, crop_value - offset, 0])
crop_filter += f"=crop_left={left}:crop_top={top}:crop_right={right}:crop_bottom={bottom}"
if min(left, top, right, bottom) < 0:
raise click.ClickException("Cannot crop less than 0, are you cropping in the right direction?")
if preview:
out_path = ["-f", "mpegts", "-"] # pipe
else:
out_path = [
str(
video_path.with_name(
".".join(
filter(
bool,
[
video_path.stem,
video_track.language,
"crop",
str(offset or ""),
{
# ffmpeg's MKV muxer does not yet support HDR
"HEVC": "h265",
"AVC": "h264",
}.get(video_track.commercial_name, ".mp4"),
],
)
)
)
)
]
ffmpeg_call = subprocess.Popen(
[binaries.FFMPEG, "-y", "-i", str(video_path), "-map", "0:v:0", "-c", "copy", "-bsf:v", crop_filter]
+ out_path,
stdout=subprocess.PIPE,
)
try:
if preview:
previewer = binaries.MPV or binaries.FFPlay
if not previewer:
raise click.ClickException("MPV/FFplay executables weren't found but are required for previewing.")
subprocess.Popen((previewer, "-"), stdin=ffmpeg_call.stdout)
finally:
if ffmpeg_call.stdout:
ffmpeg_call.stdout.close()
ffmpeg_call.wait()
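A minimal sketch of the letterbox crop math above, for a hypothetical 1920x1080 source cropped to 2.40:1:

```python
# Hypothetical 1920x1080 source cropped to 2.40:1 with no offset.
width, height, offset = 1920, 1080, 0
aspect_w, aspect_h = 2.40, 1.0
crop_value = (height - (width / (aspect_w / aspect_h))) / 2
top, bottom = int(crop_value + offset), int(crop_value - offset)
print(top, bottom)  # 140 140 -> crop_top=140:crop_bottom=140
```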
@util.command(name="range")
@click.argument("path", type=Path)
@click.option("--full/--limited", is_flag=True, help="Full: 0..255, Limited: 16..235 (16..240 YUV luma)")
@click.option(
"-p",
"--preview",
is_flag=True,
default=False,
help="Instantly preview the newly-set video range in MPV (or ffplay if mpv is unavailable).",
)
def range_(path: Path, full: bool, preview: bool) -> None:
"""
Losslessly set the Video Range flag to full or limited at the bit-stream level.
You may provide a path to a file, or a folder of mkv and/or mp4 files.
If you ever notice blacks not being quite black, and whites not being quite white,
then your video may have the range set to the wrong value. Flip its range to the
opposite value and see if that fixes it.
"""
if not binaries.FFMPEG:
raise click.ClickException('FFmpeg executable "ffmpeg" not found but is required.')
if path.is_dir():
paths = list(path.glob("*.mkv")) + list(path.glob("*.mp4"))
else:
paths = [path]
for video_path in paths:
try:
video_track = next(iter(MediaInfo.parse(video_path).video_tracks or []))
except StopIteration:
raise click.ClickException("There's no video tracks in the provided file.")
metadata_key = {"HEVC": "hevc_metadata", "AVC": "h264_metadata"}.get(video_track.commercial_name)
if not metadata_key:
raise click.ClickException(f"{video_track.commercial_name} Codec not supported.")
if preview:
out_path = ["-f", "mpegts", "-"] # pipe
else:
out_path = [
str(
video_path.with_name(
".".join(
filter(
bool,
[
video_path.stem,
video_track.language,
"range",
["limited", "full"][full],
{
# ffmpeg's MKV muxer does not yet support HDR
"HEVC": "h265",
"AVC": "h264",
}.get(video_track.commercial_name, ".mp4"),
],
)
)
)
)
]
ffmpeg_call = subprocess.Popen(
[
binaries.FFMPEG,
"-y",
"-i",
str(video_path),
"-map",
"0:v:0",
"-c",
"copy",
"-bsf:v",
f"{metadata_key}=video_full_range_flag={int(full)}",
]
+ out_path,
stdout=subprocess.PIPE,
)
try:
if preview:
previewer = binaries.MPV or binaries.FFPlay
if not previewer:
raise click.ClickException("MPV/FFplay executables weren't found but are required for previewing.")
subprocess.Popen((previewer, "-"), stdin=ffmpeg_call.stdout)
finally:
if ffmpeg_call.stdout:
ffmpeg_call.stdout.close()
ffmpeg_call.wait()
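A minimal sketch of the bitstream-filter string built above; flipping --full/--limited only changes the trailing flag value:

```python
# The two supported metadata filters with either flag state.
for metadata_key, full in (("h264_metadata", True), ("hevc_metadata", False)):
    print(f"{metadata_key}=video_full_range_flag={int(full)}")
# h264_metadata=video_full_range_flag=1
# hevc_metadata=video_full_range_flag=0
```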
@util.command()
@click.argument("path", type=Path)
@click.option(
"-m", "--map", "map_", type=str, default="0", help="Test specific streams by setting FFmpeg's -map parameter."
)
def test(path: Path, map_: str) -> None:
"""
Decode an entire video and check for any corruptions or errors using FFmpeg.
You may provide a path to a file, or a folder of mkv and/or mp4 files.
Tests all streams within the file by default. Subtitles cannot be tested.
You may choose specific streams using the -m/--map parameter. E.g.,
'0:v:0' to test the first video stream, or '0:a' to test all audio streams.
"""
if not binaries.FFMPEG:
raise click.ClickException('FFmpeg executable "ffmpeg" not found but is required.')
if path.is_dir():
paths = list(path.glob("*.mkv")) + list(path.glob("*.mp4"))
else:
paths = [path]
for video_path in paths:
print("Starting...")
p = subprocess.Popen(
[
binaries.FFMPEG,
"-hide_banner",
"-benchmark",
"-i",
str(video_path),
"-map",
map_,
"-sn",
"-f",
"null",
"-",
],
stderr=subprocess.PIPE,
universal_newlines=True,
)
reached_output = False
errors = 0
for line in p.stderr:
line = line.strip()
if "speed=" in line:
reached_output = True
if not reached_output:
continue
if line.startswith("["): # error of some kind
errors += 1
stream, error = line.split("] ", maxsplit=1)
stream = stream.split(" @ ")[0]
line = f"{stream} ERROR: {error}"
print(line)
p.stderr.close()
print(f"Finished with {errors} Errors, Cleaning up...")
p.terminate()
p.wait()

272
unshackle/commands/wvd.py Normal file

@ -0,0 +1,272 @@
import logging
import shutil
from pathlib import Path
from typing import Optional
import click
import yaml
from google.protobuf.json_format import MessageToDict
from pywidevine.device import Device, DeviceTypes
from pywidevine.license_protocol_pb2 import FileHashes
from rich.prompt import Prompt
from unidecode import UnidecodeError, unidecode
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import context_settings
@click.group(
short_help="Manage configuration and creation of WVD (Widevine Device) files.", context_settings=context_settings
)
def wvd() -> None:
"""Manage configuration and creation of WVD (Widevine Device) files."""
@wvd.command()
@click.argument("paths", type=Path, nargs=-1)
def add(paths: list[Path]) -> None:
"""Add one or more WVD (Widevine Device) files to the WVDs Directory."""
log = logging.getLogger("wvd")
for path in paths:
dst_path = config.directories.wvds / path.name
if not path.exists():
log.error(f"The WVD path '{path}' does not exist...")
elif dst_path.exists():
log.error(f"WVD named '{path.stem}' already exists...")
else:
# TODO: Check for and log errors
_ = Device.load(path) # test if WVD is valid
dst_path.parent.mkdir(parents=True, exist_ok=True)
shutil.move(path, dst_path)
log.info(f"Added {path.stem}")
@wvd.command()
@click.argument("names", type=str, nargs=-1)
def delete(names: list[str]) -> None:
"""Delete one or more WVD (Widevine Device) files from the WVDs Directory."""
log = logging.getLogger("wvd")
for name in names:
path = (config.directories.wvds / name).with_suffix(".wvd")
if not path.exists():
log.error(f"No WVD file exists by the name '{name}'...")
continue
answer = Prompt.ask(
f"[red]Deleting '{name}'[/], are you sure you want to continue?",
choices=["y", "n"],
default="n",
console=console,
)
if answer == "n":
log.info("Aborting...")
continue
Path.unlink(path)
log.info(f"Deleted {name}")
@wvd.command()
@click.argument("path", type=Path)
def parse(path: Path) -> None:
"""
Parse a .WVD Widevine Device file to check information.
Relative paths are relative to the WVDs directory.
"""
try:
named = not path.suffix and path.relative_to(Path(""))
except ValueError:
named = False
if named:
path = config.directories.wvds / f"{path.name}.wvd"
log = logging.getLogger("wvd")
if not path.exists():
console.log(f"[bright_blue]{path.absolute()}[/] does not exist...")
return
device = Device.load(path)
log.info(f"System ID: {device.system_id}")
log.info(f"Security Level: {device.security_level}")
log.info(f"Type: {device.type}")
log.info(f"Flags: {device.flags}")
log.info(f"Private Key: {bool(device.private_key)}")
log.info(f"Client ID: {bool(device.client_id)}")
log.info(f"VMP: {bool(device.client_id.vmp_data)}")
log.info("Client ID:")
log.info(device.client_id)
log.info("VMP:")
if device.client_id.vmp_data:
file_hashes = FileHashes()
file_hashes.ParseFromString(device.client_id.vmp_data)
log.info(str(file_hashes))
else:
log.info("None")
@wvd.command()
@click.argument("wvd_paths", type=Path, nargs=-1)
@click.argument("out_dir", type=Path, nargs=1)
def dump(wvd_paths: list[Path], out_dir: Path) -> None:
"""
Extract data from a .WVD Widevine Device file to a folder structure.
If a path is relative and has no file extension, the WVD is looked up in the WVDs
directory.
"""
log = logging.getLogger("wvd")
if wvd_paths == ():
if not config.directories.wvds.exists():
console.log(f"[bright_blue]{config.directories.wvds.absolute()}[/] does not exist...")
wvd_paths = list(x for x in config.directories.wvds.iterdir() if x.is_file() and x.suffix.lower() == ".wvd")
if not wvd_paths:
console.log(f"[bright_blue]{config.directories.wvds.absolute()}[/] is empty...")
for i, (wvd_path, out_path) in enumerate(zip(wvd_paths, (out_dir / x.stem for x in wvd_paths))):
if i > 0:
log.info("")
try:
named = not wvd_path.suffix and wvd_path.relative_to(Path(""))
except ValueError:
named = False
if named:
wvd_path = config.directories.wvds / f"{wvd_path.stem}.wvd"
out_path.mkdir(parents=True, exist_ok=True)
log.info(f"Dumping: {wvd_path}")
device = Device.load(wvd_path)
log.info(f"L{device.security_level} {device.system_id} {device.type.name}")
log.info(f"Saving to: {out_path}")
device_meta = {
"wvd": {"device_type": device.type.name, "security_level": device.security_level, **device.flags},
"client_info": {},
"capabilities": MessageToDict(device.client_id, preserving_proto_field_name=True)["client_capabilities"],
}
for client_info in device.client_id.client_info:
device_meta["client_info"][client_info.name] = client_info.value
device_meta_path = out_path / "metadata.yml"
device_meta_path.write_text(yaml.dump(device_meta), encoding="utf8")
log.info(" + Device Metadata")
if device.private_key:
private_key_path = out_path / "private_key.pem"
private_key_path.write_text(data=device.private_key.export_key().decode(), encoding="utf8")
private_key_path.with_suffix(".der").write_bytes(device.private_key.export_key(format="DER"))
log.info(" + Private Key")
else:
log.warning(" - No Private Key available")
if device.client_id:
client_id_path = out_path / "client_id.bin"
client_id_path.write_bytes(device.client_id.SerializeToString())
log.info(" + Client ID")
else:
log.warning(" - No Client ID available")
if device.client_id.vmp_data:
vmp_path = out_path / "vmp.bin"
vmp_path.write_bytes(device.client_id.vmp_data)
log.info(" + VMP (File Hashes)")
else:
log.info(" - No VMP (File Hashes) available")
@wvd.command()
@click.argument("name", type=str)
@click.argument("private_key", type=Path)
@click.argument("client_id", type=Path)
@click.argument("file_hashes", type=Path, required=False)
@click.option(
"-t",
"--type",
"type_",
type=click.Choice([x.name for x in DeviceTypes], case_sensitive=False),
default="Android",
help="Device Type",
)
@click.option("-l", "--level", type=click.IntRange(1, 3), default=1, help="Device Security Level")
@click.option("-o", "--output", type=Path, default=None, help="Output Directory")
@click.pass_context
def new(
ctx: click.Context,
name: str,
private_key: Path,
client_id: Path,
file_hashes: Optional[Path],
type_: str,
level: int,
output: Optional[Path],
) -> None:
"""
Create a new .WVD Widevine provision file.
name: The origin device name of the provided data, e.g. `Nexus 6P`. You do not need to
specify the security level; that is determined automatically.
private_key: A PEM file of a Device's private key.
client_id: A binary blob file which follows the Widevine ClientIdentification protobuf
schema.
file_hashes: A binary blob file which follows the Widevine FileHashes protobuf schema.
Also known as VMP as it's used for VMP (Verified Media Path) assurance.
"""
try:
# TODO: Remove need for name, create name based on Client IDs ClientInfo values
name = unidecode(name.strip().lower().replace(" ", "_"))
except UnidecodeError as e:
raise click.UsageError(f"name: Failed to sanitize name, {e}", ctx)
if not name:
raise click.UsageError("name: Empty after sanitizing, please make sure the name is valid.", ctx)
if not private_key.is_file():
raise click.UsageError("private_key: Not a path to a file, or it doesn't exist.", ctx)
if not client_id.is_file():
raise click.UsageError("client_id: Not a path to a file, or it doesn't exist.", ctx)
if file_hashes and not file_hashes.is_file():
raise click.UsageError("file_hashes: Not a path to a file, or it doesn't exist.", ctx)
device = Device(
type_=DeviceTypes[type_.upper()],
security_level=level,
flags=None,
private_key=private_key.read_bytes(),
client_id=client_id.read_bytes(),
)
if file_hashes:
device.client_id.vmp_data = file_hashes.read_bytes()
out_path = (output or config.directories.wvds) / f"{name}_{device.system_id}_l{device.security_level}.wvd"
device.dump(out_path)
log = logging.getLogger("wvd")
log.info(f"Created binary WVD file, {out_path.name}")
log.info(f" + Saved to: {out_path.absolute()}")
log.info(f"System ID: {device.system_id}")
log.info(f"Security Level: {device.security_level}")
log.info(f"Type: {device.type}")
log.info(f"Flags: {device.flags}")
log.info(f"Private Key: {bool(device.private_key)}")
log.info(f"Client ID: {bool(device.client_id)}")
log.info(f"VMP: {bool(device.client_id.vmp_data)}")
log.info("Client ID:")
log.info(device.client_id)
log.info("VMP:")
if device.client_id.vmp_data:
file_hashes = FileHashes()
file_hashes.ParseFromString(device.client_id.vmp_data)
log.info(str(file_hashes))
else:
log.info("None")


@ -0,0 +1 @@
__version__ = "2.1.0"


@ -0,0 +1,96 @@
import atexit
import logging
import click
import urllib3
from rich import traceback
from rich.console import Group
from rich.padding import Padding
from rich.text import Text
from urllib3.exceptions import InsecureRequestWarning
from unshackle.core import __version__
from unshackle.core.commands import Commands
from unshackle.core.config import config
from unshackle.core.console import ComfyRichHandler, console
from unshackle.core.constants import context_settings
from unshackle.core.update_checker import UpdateChecker
from unshackle.core.utilities import close_debug_logger, init_debug_logger
@click.command(cls=Commands, invoke_without_command=True, context_settings=context_settings)
@click.option("-v", "--version", is_flag=True, default=False, help="Print version information.")
@click.option("-d", "--debug", is_flag=True, default=False, help="Enable DEBUG level logs and JSON debug logging.")
def main(version: bool, debug: bool) -> None:
"""unshackle—Modular Movie, TV, and Music Archival Software."""
debug_logging_enabled = debug or config.debug
logging.basicConfig(
level=logging.DEBUG if debug else logging.INFO,
format="%(message)s",
handlers=[
ComfyRichHandler(
show_time=False,
show_path=debug,
console=console,
rich_tracebacks=True,
tracebacks_suppress=[click],
log_renderer=console._log_render, # noqa
)
],
)
if debug_logging_enabled:
init_debug_logger(enabled=True)
urllib3.disable_warnings(InsecureRequestWarning)
traceback.install(console=console, width=80, suppress=[click])
console.print(
Padding(
Group(
Text(
r"▄• ▄▌ ▐ ▄ .▄▄ · ▄ .▄ ▄▄▄· ▄▄· ▄ •▄ ▄▄▌ ▄▄▄ ." + "\n"
r"█▪██▌•█▌▐█▐█ ▀. ██▪▐█▐█ ▀█ ▐█ ▌▪█▌▄▌▪██• ▀▄.▀·" + "\n"
r"█▌▐█▌▐█▐▐▌▄▀▀▀█▄██▀▐█▄█▀▀█ ██ ▄▄▐▀▀▄·██▪ ▐▀▀▪▄" + "\n"
r"▐█▄█▌██▐█▌▐█▄▪▐███▌▐▀▐█ ▪▐▌▐███▌▐█.█▌▐█▌▐▌▐█▄▄▌" + "\n"
r" ▀▀▀ ▀▀ █▪ ▀▀▀▀ ▀▀▀ · ▀ ▀ ·▀▀▀ ·▀ ▀.▀▀▀ ▀▀▀ ",
style="ascii.art",
),
f"v [repr.number]{__version__}[/] - © 2025 - github.com/unshackle-dl/unshackle",
),
(1, 11, 1, 10),
expand=True,
),
justify="center",
)
if version:
return
if config.update_checks:
try:
latest_version = UpdateChecker.check_for_updates_sync(__version__)
if latest_version:
console.print(
f"\n[yellow]⚠️ Update available![/yellow] "
f"Current: {__version__} → Latest: [green]{latest_version}[/green]",
justify="center",
)
console.print(
"Visit: https://github.com/unshackle-dl/unshackle/releases/latest\n",
justify="center",
)
except Exception:
pass
@atexit.register
def cleanup():
"""Clean up resources on exit."""
close_debug_logger()
if __name__ == "__main__":
main()

3
unshackle/core/api/__init__.py Normal file

@ -0,0 +1,3 @@
from unshackle.core.api.routes import cors_middleware, setup_routes, setup_swagger
__all__ = ["setup_routes", "setup_swagger", "cors_middleware"]

660
unshackle/core/api/download_manager.py Normal file

@ -0,0 +1,660 @@
import asyncio
import json
import logging
import os
import sys
import tempfile
import threading
import uuid
from contextlib import suppress
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Any, Callable, Dict, List, Optional
log = logging.getLogger("download_manager")
class JobStatus(Enum):
QUEUED = "queued"
DOWNLOADING = "downloading"
COMPLETED = "completed"
FAILED = "failed"
CANCELLED = "cancelled"
@dataclass
class DownloadJob:
"""Represents a download job with all its parameters and status."""
job_id: str
status: JobStatus
created_time: datetime
service: str
title_id: str
parameters: Dict[str, Any]
# Progress tracking
started_time: Optional[datetime] = None
completed_time: Optional[datetime] = None
progress: float = 0.0
# Results and error info
output_files: List[str] = field(default_factory=list)
error_message: Optional[str] = None
error_details: Optional[str] = None
error_code: Optional[str] = None
error_traceback: Optional[str] = None
worker_stderr: Optional[str] = None
# Cancellation support
cancel_event: threading.Event = field(default_factory=threading.Event)
def to_dict(self, include_full_details: bool = False) -> Dict[str, Any]:
"""Convert job to dictionary for JSON response."""
result = {
"job_id": self.job_id,
"status": self.status.value,
"created_time": self.created_time.isoformat(),
"service": self.service,
"title_id": self.title_id,
"progress": self.progress,
}
if include_full_details:
result.update(
{
"parameters": self.parameters,
"started_time": self.started_time.isoformat() if self.started_time else None,
"completed_time": self.completed_time.isoformat() if self.completed_time else None,
"output_files": self.output_files,
"error_message": self.error_message,
"error_details": self.error_details,
"error_code": self.error_code,
"error_traceback": self.error_traceback,
"worker_stderr": self.worker_stderr,
}
)
return result
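A minimal sketch of the summary dict a freshly queued job serializes to, assuming the DownloadJob and JobStatus definitions above; the IDs are made up:

```python
from datetime import datetime

# Assumes DownloadJob and JobStatus as defined above; IDs are hypothetical.
job = DownloadJob(
    job_id="123e4567-e89b-12d3-a456-426614174000",
    status=JobStatus.QUEUED,
    created_time=datetime.now(),
    service="EXAMPLE",
    title_id="some-title-id",
    parameters={"quality": [1080]},
)
print(job.to_dict())  # summary only: job_id, status, created_time, service, title_id, progress
```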
def _perform_download(
job_id: str,
service: str,
title_id: str,
params: Dict[str, Any],
cancel_event: Optional[threading.Event] = None,
progress_callback: Optional[Callable[[Dict[str, Any]], None]] = None,
) -> List[str]:
"""Execute the synchronous download logic for a job."""
def _check_cancel(stage: str):
if cancel_event and cancel_event.is_set():
raise Exception(f"Job was cancelled {stage}")
from contextlib import redirect_stderr, redirect_stdout
from io import StringIO
_check_cancel("before execution started")
# Import dl.py components lazily to avoid circular deps during module import
import click
import yaml
from unshackle.commands.dl import dl
from unshackle.core.config import config
from unshackle.core.services import Services
from unshackle.core.utils.click_types import ContextData
from unshackle.core.utils.collections import merge_dict
log.info(f"Starting sync download for job {job_id}")
# Load service configuration
service_config_path = Services.get_path(service) / config.filenames.config
if service_config_path.exists():
service_config = yaml.safe_load(service_config_path.read_text(encoding="utf8"))
else:
service_config = {}
merge_dict(config.services.get(service), service_config)
from unshackle.commands.dl import dl as dl_command
ctx = click.Context(dl_command.cli)
ctx.invoked_subcommand = service
ctx.obj = ContextData(config=service_config, cdm=None, proxy_providers=[], profile=params.get("profile"))
ctx.params = {
"proxy": params.get("proxy"),
"no_proxy": params.get("no_proxy", False),
"profile": params.get("profile"),
"tag": params.get("tag"),
"tmdb_id": params.get("tmdb_id"),
"tmdb_name": params.get("tmdb_name", False),
"tmdb_year": params.get("tmdb_year", False),
}
dl_instance = dl(
ctx=ctx,
no_proxy=params.get("no_proxy", False),
profile=params.get("profile"),
proxy=params.get("proxy"),
tag=params.get("tag"),
tmdb_id=params.get("tmdb_id"),
tmdb_name=params.get("tmdb_name", False),
tmdb_year=params.get("tmdb_year", False),
)
service_module = Services.load(service)
_check_cancel("before service instantiation")
try:
import inspect
service_init_params = inspect.signature(service_module.__init__).parameters
service_ctx = click.Context(click.Command(service))
service_ctx.parent = ctx
service_ctx.obj = ctx.obj
service_kwargs = {}
if "title" in service_init_params:
service_kwargs["title"] = title_id
for key, value in params.items():
if key in service_init_params and key not in ["service", "title_id"]:
service_kwargs[key] = value
for param_name, param_info in service_init_params.items():
if param_name not in service_kwargs and param_name not in ["self", "ctx"]:
if param_info.default is inspect.Parameter.empty:
if param_name == "movie":
service_kwargs[param_name] = "/movies/" in title_id
elif param_name == "meta_lang":
service_kwargs[param_name] = None
else:
log.warning(f"Unknown required parameter '{param_name}' for service {service}, using None")
service_kwargs[param_name] = None
service_instance = service_module(service_ctx, **service_kwargs)
except Exception as exc: # noqa: BLE001 - propagate meaningful failure
log.error(f"Failed to create service instance: {exc}")
raise
original_download_dir = config.directories.downloads
_check_cancel("before download execution")
stdout_capture = StringIO()
stderr_capture = StringIO()
# Simple progress tracking if callback provided
if progress_callback:
# Report initial progress
progress_callback({"progress": 0.0, "status": "starting"})
# Simple approach: report progress at key points
original_result = dl_instance.result
def result_with_progress(*args, **kwargs):
try:
# Report that download started
progress_callback({"progress": 5.0, "status": "downloading"})
# Call original method
result = original_result(*args, **kwargs)
# Report completion
progress_callback({"progress": 100.0, "status": "completed"})
return result
except Exception as e:
progress_callback({"progress": 0.0, "status": "failed", "error": str(e)})
raise
dl_instance.result = result_with_progress
try:
with redirect_stdout(stdout_capture), redirect_stderr(stderr_capture):
dl_instance.result(
service=service_instance,
quality=params.get("quality", []),
vcodec=params.get("vcodec"),
acodec=params.get("acodec"),
vbitrate=params.get("vbitrate"),
abitrate=params.get("abitrate"),
range_=params.get("range", ["SDR"]),
channels=params.get("channels"),
no_atmos=params.get("no_atmos", False),
wanted=params.get("wanted", []),
latest_episode=params.get("latest_episode", False),
lang=params.get("lang", ["orig"]),
v_lang=params.get("v_lang", []),
a_lang=params.get("a_lang", []),
s_lang=params.get("s_lang", ["all"]),
require_subs=params.get("require_subs", []),
forced_subs=params.get("forced_subs", False),
exact_lang=params.get("exact_lang", False),
sub_format=params.get("sub_format"),
video_only=params.get("video_only", False),
audio_only=params.get("audio_only", False),
subs_only=params.get("subs_only", False),
chapters_only=params.get("chapters_only", False),
no_subs=params.get("no_subs", False),
no_audio=params.get("no_audio", False),
no_chapters=params.get("no_chapters", False),
audio_description=params.get("audio_description", False),
slow=params.get("slow", False),
list_=False,
list_titles=False,
skip_dl=params.get("skip_dl", False),
export=params.get("export"),
cdm_only=params.get("cdm_only"),
no_proxy=params.get("no_proxy", False),
no_folder=params.get("no_folder", False),
no_source=params.get("no_source", False),
no_mux=params.get("no_mux", False),
workers=params.get("workers"),
downloads=params.get("downloads", 1),
best_available=params.get("best_available", False),
)
except SystemExit as exc:
if exc.code != 0:
stdout_str = stdout_capture.getvalue()
stderr_str = stderr_capture.getvalue()
log.error(f"Download exited with code {exc.code}")
log.error(f"Stdout: {stdout_str}")
log.error(f"Stderr: {stderr_str}")
raise Exception(f"Download failed with exit code {exc.code}")
except Exception as exc: # noqa: BLE001 - propagate to caller
stdout_str = stdout_capture.getvalue()
stderr_str = stderr_capture.getvalue()
log.error(f"Download execution failed: {exc}")
log.error(f"Stdout: {stdout_str}")
log.error(f"Stderr: {stderr_str}")
raise
log.info(f"Download completed for job {job_id}, files in {original_download_dir}")
return []
class DownloadQueueManager:
"""Manages download job queue with configurable concurrency limits."""
def __init__(self, max_concurrent_downloads: int = 2, job_retention_hours: int = 24):
self.max_concurrent_downloads = max_concurrent_downloads
self.job_retention_hours = job_retention_hours
self._jobs: Dict[str, DownloadJob] = {}
self._job_queue: asyncio.Queue = asyncio.Queue()
self._active_downloads: Dict[str, asyncio.Task] = {}
self._download_processes: Dict[str, asyncio.subprocess.Process] = {}
self._job_temp_files: Dict[str, Dict[str, str]] = {}
self._workers_started = False
self._shutdown_event = asyncio.Event()
log.info(
f"Initialized download queue manager: max_concurrent={max_concurrent_downloads}, retention_hours={job_retention_hours}"
)
def create_job(self, service: str, title_id: str, **parameters) -> DownloadJob:
"""Create a new download job and add it to the queue."""
job_id = str(uuid.uuid4())
job = DownloadJob(
job_id=job_id,
status=JobStatus.QUEUED,
created_time=datetime.now(),
service=service,
title_id=title_id,
parameters=parameters,
)
self._jobs[job_id] = job
self._job_queue.put_nowait(job)
log.info(f"Created download job {job_id} for {service}:{title_id}")
return job
def get_job(self, job_id: str) -> Optional[DownloadJob]:
"""Get job by ID."""
return self._jobs.get(job_id)
def list_jobs(self) -> List[DownloadJob]:
"""List all jobs."""
return list(self._jobs.values())
def cancel_job(self, job_id: str) -> bool:
"""Cancel a job if it's queued or downloading."""
job = self._jobs.get(job_id)
if not job:
return False
if job.status == JobStatus.QUEUED:
job.status = JobStatus.CANCELLED
job.cancel_event.set() # Signal cancellation
log.info(f"Cancelled queued job {job_id}")
return True
elif job.status == JobStatus.DOWNLOADING:
# Set the cancellation event first - this will be checked by the download thread
job.cancel_event.set()
job.status = JobStatus.CANCELLED
log.info(f"Signaled cancellation for downloading job {job_id}")
# Cancel the active download task
task = self._active_downloads.get(job_id)
if task:
task.cancel()
log.info(f"Cancelled download task for job {job_id}")
process = self._download_processes.get(job_id)
if process:
try:
process.terminate()
log.info(f"Terminated worker process for job {job_id}")
except ProcessLookupError:
log.debug(f"Worker process for job {job_id} already exited")
return True
return False
def cleanup_old_jobs(self) -> int:
"""Remove jobs older than retention period."""
cutoff_time = datetime.now() - timedelta(hours=self.job_retention_hours)
jobs_to_remove = []
for job_id, job in self._jobs.items():
if job.status in [JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED]:
if job.completed_time and job.completed_time < cutoff_time:
jobs_to_remove.append(job_id)
elif not job.completed_time and job.created_time < cutoff_time:
jobs_to_remove.append(job_id)
for job_id in jobs_to_remove:
del self._jobs[job_id]
if jobs_to_remove:
log.info(f"Cleaned up {len(jobs_to_remove)} old jobs")
return len(jobs_to_remove)
async def start_workers(self):
"""Start worker tasks to process the download queue."""
if self._workers_started:
return
self._workers_started = True
# Start worker tasks
for i in range(self.max_concurrent_downloads):
asyncio.create_task(self._download_worker(f"worker-{i}"))
# Start cleanup task
asyncio.create_task(self._cleanup_worker())
log.info(f"Started {self.max_concurrent_downloads} download workers")
async def shutdown(self):
"""Shutdown the queue manager and cancel all active downloads."""
log.info("Shutting down download queue manager")
self._shutdown_event.set()
# Cancel all active downloads
for task in self._active_downloads.values():
task.cancel()
# Terminate worker processes
for job_id, process in list(self._download_processes.items()):
try:
process.terminate()
except ProcessLookupError:
log.debug(f"Worker process for job {job_id} already exited during shutdown")
for job_id, process in list(self._download_processes.items()):
try:
await asyncio.wait_for(process.wait(), timeout=5)
except asyncio.TimeoutError:
log.warning(f"Worker process for job {job_id} did not exit, killing")
process.kill()
await process.wait()
finally:
self._download_processes.pop(job_id, None)
# Clean up any remaining temp files
for paths in self._job_temp_files.values():
for path in paths.values():
try:
os.remove(path)
except OSError:
pass
self._job_temp_files.clear()
# Wait for workers to finish
if self._active_downloads:
await asyncio.gather(*self._active_downloads.values(), return_exceptions=True)
async def _download_worker(self, worker_name: str):
"""Worker task that processes jobs from the queue."""
log.debug(f"Download worker {worker_name} started")
while not self._shutdown_event.is_set():
try:
# Wait for a job or shutdown signal
job = await asyncio.wait_for(self._job_queue.get(), timeout=1.0)
if job.status == JobStatus.CANCELLED:
continue
# Start processing the job
job.status = JobStatus.DOWNLOADING
job.started_time = datetime.now()
log.info(f"Worker {worker_name} starting job {job.job_id}")
# Create download task
download_task = asyncio.create_task(self._execute_download(job))
self._active_downloads[job.job_id] = download_task
try:
await download_task
except asyncio.CancelledError:
job.status = JobStatus.CANCELLED
log.info(f"Job {job.job_id} was cancelled")
except Exception as e:
job.status = JobStatus.FAILED
job.error_message = str(e)
log.error(f"Job {job.job_id} failed: {e}")
finally:
job.completed_time = datetime.now()
if job.job_id in self._active_downloads:
del self._active_downloads[job.job_id]
except asyncio.TimeoutError:
continue
except Exception as e:
log.error(f"Worker {worker_name} error: {e}")
async def _execute_download(self, job: DownloadJob):
"""Execute the actual download for a job."""
log.info(f"Executing download for job {job.job_id}")
try:
output_files = await self._run_download_async(job)
job.status = JobStatus.COMPLETED
job.output_files = output_files
job.progress = 100.0
log.info(f"Download completed for job {job.job_id}: {len(output_files)} files")
except Exception as e:
import traceback
from unshackle.core.api.errors import categorize_exception
job.status = JobStatus.FAILED
job.error_message = str(e)
job.error_details = str(e)
api_error = categorize_exception(
e, context={"service": job.service, "title_id": job.title_id, "job_id": job.job_id}
)
job.error_code = api_error.error_code.value
job.error_traceback = traceback.format_exc()
log.error(f"Download failed for job {job.job_id}: {e}")
raise
async def _run_download_async(self, job: DownloadJob) -> List[str]:
"""Invoke a worker subprocess to execute the download."""
payload = {
"job_id": job.job_id,
"service": job.service,
"title_id": job.title_id,
"parameters": job.parameters,
}
payload_fd, payload_path = tempfile.mkstemp(prefix=f"unshackle_job_{job.job_id}_", suffix="_payload.json")
os.close(payload_fd)
result_fd, result_path = tempfile.mkstemp(prefix=f"unshackle_job_{job.job_id}_", suffix="_result.json")
os.close(result_fd)
progress_fd, progress_path = tempfile.mkstemp(prefix=f"unshackle_job_{job.job_id}_", suffix="_progress.json")
os.close(progress_fd)
with open(payload_path, "w", encoding="utf-8") as handle:
json.dump(payload, handle)
process = await asyncio.create_subprocess_exec(
sys.executable,
"-m",
"unshackle.core.api.download_worker",
payload_path,
result_path,
progress_path,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
self._download_processes[job.job_id] = process
self._job_temp_files[job.job_id] = {"payload": payload_path, "result": result_path, "progress": progress_path}
communicate_task = asyncio.create_task(process.communicate())
stdout_bytes = b""
stderr_bytes = b""
try:
while True:
done, _ = await asyncio.wait({communicate_task}, timeout=0.5)
if communicate_task in done:
stdout_bytes, stderr_bytes = communicate_task.result()
break
# Check for progress updates
try:
if os.path.exists(progress_path):
with open(progress_path, "r", encoding="utf-8") as handle:
progress_data = json.load(handle)
if "progress" in progress_data:
new_progress = float(progress_data["progress"])
if new_progress != job.progress:
job.progress = new_progress
log.info(f"Job {job.job_id} progress updated: {job.progress}%")
except (FileNotFoundError, json.JSONDecodeError, ValueError) as e:
log.debug(f"Could not read progress for job {job.job_id}: {e}")
if job.cancel_event.is_set() or job.status == JobStatus.CANCELLED:
log.info(f"Cancellation detected for job {job.job_id}, terminating worker process")
process.terminate()
try:
await asyncio.wait_for(communicate_task, timeout=5)
except asyncio.TimeoutError:
log.warning(f"Worker process for job {job.job_id} did not terminate, killing")
process.kill()
await asyncio.wait_for(communicate_task, timeout=5)
raise asyncio.CancelledError("Job was cancelled")
returncode = process.returncode
stdout = stdout_bytes.decode("utf-8", errors="ignore")
stderr = stderr_bytes.decode("utf-8", errors="ignore")
if stdout.strip():
log.debug(f"Worker stdout for job {job.job_id}: {stdout.strip()}")
if stderr.strip():
log.warning(f"Worker stderr for job {job.job_id}: {stderr.strip()}")
job.worker_stderr = stderr.strip()
result_data: Optional[Dict[str, Any]] = None
try:
with open(result_path, "r", encoding="utf-8") as handle:
result_data = json.load(handle)
except FileNotFoundError:
log.error(f"Result file missing for job {job.job_id}")
except json.JSONDecodeError as exc:
log.error(f"Failed to parse worker result for job {job.job_id}: {exc}")
if returncode != 0:
                message = result_data.get("message", "unknown error") if result_data else "unknown error"
if result_data:
job.error_details = result_data.get("error_details", message)
job.error_code = result_data.get("error_code")
raise Exception(f"Worker exited with code {returncode}: {message}")
if not result_data or result_data.get("status") != "success":
                fallback = "worker did not report success"
                message = result_data.get("message", fallback) if result_data else fallback
if result_data:
job.error_details = result_data.get("error_details", message)
job.error_code = result_data.get("error_code")
raise Exception(f"Worker failure: {message}")
return result_data.get("output_files", [])
finally:
if not communicate_task.done():
communicate_task.cancel()
with suppress(asyncio.CancelledError):
await communicate_task
self._download_processes.pop(job.job_id, None)
temp_paths = self._job_temp_files.pop(job.job_id, {})
for path in temp_paths.values():
try:
os.remove(path)
except OSError:
pass
def _execute_download_sync(self, job: DownloadJob) -> List[str]:
"""Execute download synchronously using existing dl.py logic."""
return _perform_download(job.job_id, job.service, job.title_id, job.parameters.copy(), job.cancel_event)
async def _cleanup_worker(self):
"""Worker that periodically cleans up old jobs."""
while not self._shutdown_event.is_set():
try:
await asyncio.sleep(3600) # Run every hour
self.cleanup_old_jobs()
except Exception as e:
log.error(f"Cleanup worker error: {e}")
# Global instance
download_manager: Optional[DownloadQueueManager] = None
def get_download_manager() -> DownloadQueueManager:
"""Get the global download manager instance."""
global download_manager
if download_manager is None:
# Load configuration from unshackle config
from unshackle.core.config import config
max_concurrent = getattr(config, "max_concurrent_downloads", 2)
retention_hours = getattr(config, "download_job_retention_hours", 24)
download_manager = DownloadQueueManager(max_concurrent, retention_hours)
return download_manager
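# Illustrative usage sketch (not part of this module): how an API-layer caller
# might queue a job and poll it. "EXAMPLE" and "title-123" are hypothetical
# service/title identifiers, and a running event loop is assumed.
async def _example_queue_and_poll() -> None:
    manager = get_download_manager()
    await manager.start_workers()
    job = manager.create_job("EXAMPLE", "title-123", quality=[1080])
    terminal = (JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED)
    while job.status not in terminal:
        await asyncio.sleep(1.0)  # progress arrives via the worker's progress file
        log.info(f"Job {job.job_id}: {job.status.value} at {job.progress:.0f}%")
    # manager.cancel_job(job.job_id) could be called at any point to abort instead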

View File

@ -0,0 +1,102 @@
"""Standalone worker process entry point for executing download jobs."""
from __future__ import annotations
import json
import logging
import sys
import traceback
from pathlib import Path
from typing import Any, Dict
from .download_manager import _perform_download
log = logging.getLogger("download_worker")
def _read_payload(path: Path) -> Dict[str, Any]:
with path.open("r", encoding="utf-8") as handle:
return json.load(handle)
def _write_result(path: Path, payload: Dict[str, Any]) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("w", encoding="utf-8") as handle:
json.dump(payload, handle)
def main(argv: list[str]) -> int:
if len(argv) not in [3, 4]:
print(
"Usage: python -m unshackle.core.api.download_worker <payload_path> <result_path> [progress_path]",
file=sys.stderr,
)
return 2
payload_path = Path(argv[1])
result_path = Path(argv[2])
progress_path = Path(argv[3]) if len(argv) > 3 else None
    payload: Dict[str, Any] = {}
    result: Dict[str, Any] = {}
exit_code = 0
try:
payload = _read_payload(payload_path)
job_id = payload["job_id"]
service = payload["service"]
title_id = payload["title_id"]
params = payload.get("parameters", {})
log.info(f"Worker starting job {job_id} ({service}:{title_id})")
def progress_callback(progress_data: Dict[str, Any]) -> None:
"""Write progress updates to file for main process to read."""
if progress_path:
try:
log.info(f"Writing progress update: {progress_data}")
_write_result(progress_path, progress_data)
log.info(f"Progress update written to {progress_path}")
except Exception as e:
log.error(f"Failed to write progress update: {e}")
output_files = _perform_download(
job_id, service, title_id, params, cancel_event=None, progress_callback=progress_callback
)
result = {"status": "success", "output_files": output_files}
except Exception as exc: # noqa: BLE001 - capture for parent process
from unshackle.core.api.errors import categorize_exception
exit_code = 1
tb = traceback.format_exc()
log.error(f"Worker failed with error: {exc}")
api_error = categorize_exception(
exc,
            context={
                "service": payload.get("service"),
                "title_id": payload.get("title_id"),
                "job_id": payload.get("job_id"),
            },
)
result = {
"status": "error",
"message": str(exc),
"error_details": api_error.message,
"error_code": api_error.error_code.value,
"traceback": tb,
}
finally:
try:
_write_result(result_path, result)
except Exception as exc: # noqa: BLE001 - last resort logging
log.error(f"Failed to write worker result file: {exc}")
return exit_code
if __name__ == "__main__":
sys.exit(main(sys.argv))
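# Illustrative parent-side sketch (not part of this worker): writing a payload,
# invoking the worker, and reading back the result file. The file names and
# job/service/title values are hypothetical; the real manager creates all
# three paths with tempfile.mkstemp.
def _example_invoke_worker() -> Dict[str, Any]:
    import subprocess

    payload = {
        "job_id": "abc123",
        "service": "EXAMPLE",
        "title_id": "t1",
        "parameters": {"quality": [1080]},
    }
    with open("payload.json", "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    subprocess.run(
        [sys.executable, "-m", "unshackle.core.api.download_worker",
         "payload.json", "result.json", "progress.json"],
        check=False,  # failures are reported through result.json, not just the exit code
    )
    # While running, the worker rewrites progress.json with e.g. {"progress": 42.5};
    # on exit, result.json holds {"status": "success", "output_files": [...]} or an error dict.
    with open("result.json", "r", encoding="utf-8") as handle:
        return json.load(handle)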

View File

@ -0,0 +1,322 @@
"""
API Error Handling System
Provides structured error responses with error codes, categorization,
and optional debug information for the unshackle REST API.
"""
from __future__ import annotations
import traceback
from datetime import datetime, timezone
from enum import Enum
from typing import Any
from aiohttp import web
class APIErrorCode(str, Enum):
"""Standard API error codes for programmatic error handling."""
# Client errors (4xx)
INVALID_INPUT = "INVALID_INPUT" # Missing or malformed request data
INVALID_SERVICE = "INVALID_SERVICE" # Unknown service name
INVALID_TITLE_ID = "INVALID_TITLE_ID" # Invalid or malformed title ID
INVALID_PROFILE = "INVALID_PROFILE" # Profile doesn't exist
INVALID_PROXY = "INVALID_PROXY" # Invalid proxy specification
INVALID_LANGUAGE = "INVALID_LANGUAGE" # Invalid language code
INVALID_PARAMETERS = "INVALID_PARAMETERS" # Invalid download parameters
AUTH_FAILED = "AUTH_FAILED" # Authentication failure (invalid credentials/cookies)
AUTH_REQUIRED = "AUTH_REQUIRED" # Missing authentication
FORBIDDEN = "FORBIDDEN" # Action not allowed
GEOFENCE = "GEOFENCE" # Content not available in region
NOT_FOUND = "NOT_FOUND" # Resource not found (title, job, etc.)
NO_CONTENT = "NO_CONTENT" # No titles/tracks/episodes found
JOB_NOT_FOUND = "JOB_NOT_FOUND" # Download job doesn't exist
RATE_LIMITED = "RATE_LIMITED" # Service rate limiting
# Server errors (5xx)
INTERNAL_ERROR = "INTERNAL_ERROR" # Unexpected server error
SERVICE_ERROR = "SERVICE_ERROR" # Streaming service API error
NETWORK_ERROR = "NETWORK_ERROR" # Network connectivity issue
DRM_ERROR = "DRM_ERROR" # DRM/license acquisition failure
DOWNLOAD_ERROR = "DOWNLOAD_ERROR" # Download process failure
SERVICE_UNAVAILABLE = "SERVICE_UNAVAILABLE" # Service temporarily unavailable
WORKER_ERROR = "WORKER_ERROR" # Download worker process error
class APIError(Exception):
"""
Structured API error with error code, message, and details.
Attributes:
error_code: Standardized error code from APIErrorCode enum
message: User-friendly error message
details: Additional structured error information
retryable: Whether the operation can be retried
http_status: HTTP status code to return (default based on error_code)
"""
def __init__(
self,
error_code: APIErrorCode,
message: str,
details: dict[str, Any] | None = None,
retryable: bool = False,
http_status: int | None = None,
):
super().__init__(message)
self.error_code = error_code
self.message = message
self.details = details or {}
self.retryable = retryable
self.http_status = http_status or self._default_http_status(error_code)
@staticmethod
def _default_http_status(error_code: APIErrorCode) -> int:
"""Map error codes to default HTTP status codes."""
status_map = {
# 400 Bad Request
APIErrorCode.INVALID_INPUT: 400,
APIErrorCode.INVALID_SERVICE: 400,
APIErrorCode.INVALID_TITLE_ID: 400,
APIErrorCode.INVALID_PROFILE: 400,
APIErrorCode.INVALID_PROXY: 400,
APIErrorCode.INVALID_LANGUAGE: 400,
APIErrorCode.INVALID_PARAMETERS: 400,
# 401 Unauthorized
APIErrorCode.AUTH_REQUIRED: 401,
APIErrorCode.AUTH_FAILED: 401,
# 403 Forbidden
APIErrorCode.FORBIDDEN: 403,
APIErrorCode.GEOFENCE: 403,
# 404 Not Found
APIErrorCode.NOT_FOUND: 404,
APIErrorCode.NO_CONTENT: 404,
APIErrorCode.JOB_NOT_FOUND: 404,
# 429 Too Many Requests
APIErrorCode.RATE_LIMITED: 429,
            # 500 Internal Server Error
            APIErrorCode.INTERNAL_ERROR: 500,
            APIErrorCode.DOWNLOAD_ERROR: 500,
            APIErrorCode.WORKER_ERROR: 500,
            # 502 Bad Gateway
            APIErrorCode.SERVICE_ERROR: 502,
            APIErrorCode.DRM_ERROR: 502,
            # 503 Service Unavailable
            APIErrorCode.NETWORK_ERROR: 503,
            APIErrorCode.SERVICE_UNAVAILABLE: 503,
return status_map.get(error_code, 500)
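# Illustrative sketch (not a real endpoint): raising a typical APIError. The
# profile name is hypothetical; the HTTP status falls out of the error code.
def _example_invalid_profile() -> None:
    raise APIError(
        APIErrorCode.INVALID_PROFILE,
        "Profile 'family' does not exist",
        details={"profile": "family"},
    )  # build_error_response() below renders this as an HTTP 400 JSON body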
def build_error_response(
error: APIError | Exception,
debug_mode: bool = False,
extra_debug_info: dict[str, Any] | None = None,
) -> web.Response:
"""
Build a structured JSON error response.
Args:
error: APIError or generic Exception to convert to response
debug_mode: Whether to include technical debug information
extra_debug_info: Additional debug info (stderr, stdout, etc.)
Returns:
aiohttp JSON response with structured error data
"""
if isinstance(error, APIError):
error_code = error.error_code.value
message = error.message
details = error.details
http_status = error.http_status
retryable = error.retryable
else:
# Generic exception - convert to INTERNAL_ERROR
error_code = APIErrorCode.INTERNAL_ERROR.value
message = str(error) or "An unexpected error occurred"
details = {}
http_status = 500
retryable = False
response_data: dict[str, Any] = {
"status": "error",
"error_code": error_code,
"message": message,
"timestamp": datetime.now(timezone.utc).isoformat(),
}
# Add details if present
if details:
response_data["details"] = details
# Add retryable hint if specified
if retryable:
response_data["retryable"] = True
# Add debug information if in debug mode
if debug_mode:
debug_info: dict[str, Any] = {
"exception_type": type(error).__name__,
}
# Add traceback for debugging
if isinstance(error, Exception):
debug_info["traceback"] = traceback.format_exc()
# Add any extra debug info provided
if extra_debug_info:
debug_info.update(extra_debug_info)
response_data["debug_info"] = debug_info
return web.json_response(response_data, status=http_status)
def categorize_exception(
exc: Exception,
context: dict[str, Any] | None = None,
) -> APIError:
"""
Categorize a generic exception into a structured APIError.
This function attempts to identify the type of error based on the exception
type, message patterns, and optional context information.
Args:
exc: The exception to categorize
context: Optional context (service name, operation type, etc.)
Returns:
APIError with appropriate error code and details
"""
context = context or {}
exc_str = str(exc).lower()
exc_type = type(exc).__name__
# Authentication errors
if any(keyword in exc_str for keyword in ["auth", "login", "credential", "unauthorized", "forbidden", "token"]):
return APIError(
error_code=APIErrorCode.AUTH_FAILED,
message=f"Authentication failed: {exc}",
details={**context, "reason": "authentication_error"},
retryable=False,
)
# Network errors
if any(
keyword in exc_str
for keyword in [
"connection",
"timeout",
"network",
"unreachable",
"socket",
"dns",
"resolve",
]
) or exc_type in ["ConnectionError", "TimeoutError", "URLError", "SSLError"]:
return APIError(
error_code=APIErrorCode.NETWORK_ERROR,
message=f"Network error occurred: {exc}",
details={**context, "reason": "network_connectivity"},
retryable=True,
http_status=503,
)
# Geofence/region errors
if any(keyword in exc_str for keyword in ["geofence", "region", "not available in", "territory"]):
return APIError(
error_code=APIErrorCode.GEOFENCE,
message=f"Content not available in your region: {exc}",
details={**context, "reason": "geofence_restriction"},
retryable=False,
)
# Not found errors
if any(keyword in exc_str for keyword in ["not found", "404", "does not exist", "invalid id"]):
return APIError(
error_code=APIErrorCode.NOT_FOUND,
message=f"Resource not found: {exc}",
details={**context, "reason": "not_found"},
retryable=False,
)
# Rate limiting
if any(keyword in exc_str for keyword in ["rate limit", "too many requests", "429", "throttle"]):
return APIError(
error_code=APIErrorCode.RATE_LIMITED,
message=f"Rate limit exceeded: {exc}",
details={**context, "reason": "rate_limited"},
retryable=True,
http_status=429,
)
# DRM errors
if any(keyword in exc_str for keyword in ["drm", "license", "widevine", "playready", "decrypt"]):
return APIError(
error_code=APIErrorCode.DRM_ERROR,
message=f"DRM error: {exc}",
details={**context, "reason": "drm_failure"},
retryable=False,
)
# Service unavailable
if any(keyword in exc_str for keyword in ["service unavailable", "503", "maintenance", "temporarily unavailable"]):
return APIError(
error_code=APIErrorCode.SERVICE_UNAVAILABLE,
message=f"Service temporarily unavailable: {exc}",
details={**context, "reason": "service_unavailable"},
retryable=True,
http_status=503,
)
# Validation errors
if any(keyword in exc_str for keyword in ["invalid", "malformed", "validation"]) or exc_type in [
"ValueError",
"ValidationError",
]:
return APIError(
error_code=APIErrorCode.INVALID_INPUT,
message=f"Invalid input: {exc}",
details={**context, "reason": "validation_failed"},
retryable=False,
)
# Default to internal error for unknown exceptions
return APIError(
error_code=APIErrorCode.INTERNAL_ERROR,
message=f"An unexpected error occurred: {exc}",
details={**context, "exception_type": exc_type},
retryable=False,
)
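# Illustrative sketch (not part of the public API): how the keyword heuristics
# above classify a raw exception. The service tag is hypothetical.
def _example_categorize() -> None:
    err = categorize_exception(ConnectionError("connection refused"), context={"service": "EXAMPLE"})
    assert err.error_code is APIErrorCode.NETWORK_ERROR  # matched "connection"
    assert err.retryable is True and err.http_status == 503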
def handle_api_exception(
exc: Exception,
context: dict[str, Any] | None = None,
debug_mode: bool = False,
extra_debug_info: dict[str, Any] | None = None,
) -> web.Response:
"""
Convenience function to categorize an exception and build an error response.
Args:
exc: The exception to handle
context: Optional context information
debug_mode: Whether to include debug information
extra_debug_info: Additional debug info
Returns:
Structured JSON error response
"""
if isinstance(exc, APIError):
api_error = exc
else:
api_error = categorize_exception(exc, context)
return build_error_response(api_error, debug_mode, extra_debug_info)
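# Illustrative sketch (hypothetical route body): the intended call pattern for
# handle_api_exception() inside an aiohttp handler.
async def _example_route(request: web.Request) -> web.Response:
    try:
        raise ValueError("malformed title id")  # stand-in for real handler work
    except Exception as exc:
        return handle_api_exception(
            exc,
            context={"operation": "example"},
            debug_mode=request.app.get("debug_api", False),
        )  # categorized as INVALID_INPUT -> HTTP 400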

View File

@ -0,0 +1,936 @@
import logging
from typing import Any, Dict, List, Optional
from aiohttp import web
from unshackle.core.api.errors import APIError, APIErrorCode, handle_api_exception
from unshackle.core.constants import AUDIO_CODEC_MAP, DYNAMIC_RANGE_MAP, VIDEO_CODEC_MAP
from unshackle.core.proxies.basic import Basic
from unshackle.core.proxies.hola import Hola
from unshackle.core.proxies.nordvpn import NordVPN
from unshackle.core.proxies.surfsharkvpn import SurfsharkVPN
from unshackle.core.services import Services
from unshackle.core.titles import Episode, Movie, Title_T
from unshackle.core.tracks import Audio, Subtitle, Video
log = logging.getLogger("api")
DEFAULT_DOWNLOAD_PARAMS = {
"profile": None,
"quality": [],
"vcodec": None,
"acodec": None,
"vbitrate": None,
"abitrate": None,
"range": ["SDR"],
"channels": None,
"no_atmos": False,
"wanted": [],
"latest_episode": False,
"lang": ["orig"],
"v_lang": [],
"a_lang": [],
"s_lang": ["all"],
"require_subs": [],
"forced_subs": False,
"exact_lang": False,
"sub_format": None,
"video_only": False,
"audio_only": False,
"subs_only": False,
"chapters_only": False,
"no_subs": False,
"no_audio": False,
"no_chapters": False,
"audio_description": False,
"slow": False,
"skip_dl": False,
"export": None,
"cdm_only": None,
"no_proxy": False,
"no_folder": False,
"no_source": False,
"no_mux": False,
"workers": None,
"downloads": 1,
"best_available": False,
}
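# Illustrative sketch (hypothetical values): request parameters override a
# service's click defaults, which override the globals above - the same
# precedence download_handler() applies when building a job.
def _example_param_merge() -> Dict[str, Any]:
    service_defaults = {"lang": ["en"]}                      # from a service's CLI params
    request_params = {"quality": [2160], "lang": ["orig"]}   # from the POST body
    return {**DEFAULT_DOWNLOAD_PARAMS, **service_defaults, **request_params}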
def initialize_proxy_providers() -> List[Any]:
"""Initialize and return available proxy providers."""
proxy_providers = []
try:
from unshackle.core import binaries
# Load the main unshackle config to get proxy provider settings
from unshackle.core.config import config as main_config
log.debug(f"Main config proxy providers: {getattr(main_config, 'proxy_providers', {})}")
log.debug(f"Available proxy provider configs: {list(getattr(main_config, 'proxy_providers', {}).keys())}")
# Use main_config instead of the service-specific config for proxy providers
proxy_config = getattr(main_config, "proxy_providers", {})
if proxy_config.get("basic"):
log.debug("Loading Basic proxy provider")
proxy_providers.append(Basic(**proxy_config["basic"]))
if proxy_config.get("nordvpn"):
log.debug("Loading NordVPN proxy provider")
proxy_providers.append(NordVPN(**proxy_config["nordvpn"]))
if proxy_config.get("surfsharkvpn"):
log.debug("Loading SurfsharkVPN proxy provider")
proxy_providers.append(SurfsharkVPN(**proxy_config["surfsharkvpn"]))
if hasattr(binaries, "HolaProxy") and binaries.HolaProxy:
log.debug("Loading Hola proxy provider")
proxy_providers.append(Hola())
for proxy_provider in proxy_providers:
log.info(f"Loaded {proxy_provider.__class__.__name__}: {proxy_provider}")
if not proxy_providers:
log.warning("No proxy providers were loaded. Check your proxy provider configuration in unshackle.yaml")
except Exception as e:
log.warning(f"Failed to initialize some proxy providers: {e}")
return proxy_providers
def resolve_proxy(proxy: str, proxy_providers: List[Any]) -> str:
"""Resolve proxy parameter to actual proxy URI."""
import re
if not proxy:
return proxy
# Check if explicit proxy URI
if re.match(r"^https?://", proxy):
return proxy
# Handle provider:country format (e.g., "nordvpn:us")
requested_provider = None
if re.match(r"^[a-z]+:.+$", proxy, re.IGNORECASE):
requested_provider, proxy = proxy.split(":", maxsplit=1)
# Handle country code format (e.g., "us", "uk")
if re.match(r"^[a-z]{2}(?:\d+)?$", proxy, re.IGNORECASE):
proxy = proxy.lower()
if requested_provider:
# Find specific provider (case-insensitive matching)
proxy_provider = next(
(x for x in proxy_providers if x.__class__.__name__.lower() == requested_provider.lower()),
None,
)
if not proxy_provider:
available_providers = [x.__class__.__name__ for x in proxy_providers]
raise ValueError(
f"The proxy provider '{requested_provider}' was not recognized. Available providers: {available_providers}"
)
proxy_uri = proxy_provider.get_proxy(proxy)
if not proxy_uri:
raise ValueError(f"The proxy provider {requested_provider} had no proxy for {proxy}")
log.info(f"Using {proxy_provider.__class__.__name__} Proxy: {proxy_uri}")
return proxy_uri
else:
# Try all providers
for proxy_provider in proxy_providers:
proxy_uri = proxy_provider.get_proxy(proxy)
if proxy_uri:
log.info(f"Using {proxy_provider.__class__.__name__} Proxy: {proxy_uri}")
return proxy_uri
raise ValueError(f"No proxy provider had a proxy for {proxy}")
# Return as-is if not recognized format
log.info(f"Using explicit Proxy: {proxy}")
return proxy
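# Illustrative sketch (hypothetical endpoints): the three spellings
# resolve_proxy() accepts.
def _example_resolve(providers: List[Any]) -> None:
    resolve_proxy("http://user:pass@127.0.0.1:8080", providers)  # explicit URI, returned as-is
    resolve_proxy("nordvpn:us", providers)  # named provider + country code
    resolve_proxy("us", providers)          # first provider with a US proxy wins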
def validate_service(service_tag: str) -> Optional[str]:
"""Validate and normalize service tag."""
try:
normalized = Services.get_tag(service_tag)
service_path = Services.get_path(normalized)
if not service_path.exists():
return None
return normalized
except Exception:
return None
def serialize_title(title: Title_T) -> Dict[str, Any]:
"""Convert a title object to JSON-serializable dict."""
if isinstance(title, Episode):
episode_name = title.name if title.name else f"Episode {title.number:02d}"
result = {
"type": "episode",
"name": episode_name,
"series_title": str(title.title),
"season": title.season,
"number": title.number,
"year": title.year,
"id": str(title.id) if hasattr(title, "id") else None,
}
elif isinstance(title, Movie):
result = {
"type": "movie",
"name": str(title.name) if hasattr(title, "name") else str(title),
"year": title.year,
"id": str(title.id) if hasattr(title, "id") else None,
}
else:
result = {
"type": "other",
"name": str(title.name) if hasattr(title, "name") else str(title),
"id": str(title.id) if hasattr(title, "id") else None,
}
return result
def serialize_video_track(track: Video) -> Dict[str, Any]:
"""Convert video track to JSON-serializable dict."""
codec_name = track.codec.name if hasattr(track.codec, "name") else str(track.codec)
range_name = track.range.name if hasattr(track.range, "name") else str(track.range)
return {
"id": str(track.id),
"codec": codec_name,
"codec_display": VIDEO_CODEC_MAP.get(codec_name, codec_name),
"bitrate": int(track.bitrate / 1000) if track.bitrate else None,
"width": track.width,
"height": track.height,
"resolution": f"{track.width}x{track.height}" if track.width and track.height else None,
"fps": track.fps if track.fps else None,
"range": range_name,
"range_display": DYNAMIC_RANGE_MAP.get(range_name, range_name),
"language": str(track.language) if track.language else None,
"drm": str(track.drm) if hasattr(track, "drm") and track.drm else None,
}
def serialize_audio_track(track: Audio) -> Dict[str, Any]:
"""Convert audio track to JSON-serializable dict."""
codec_name = track.codec.name if hasattr(track.codec, "name") else str(track.codec)
return {
"id": str(track.id),
"codec": codec_name,
"codec_display": AUDIO_CODEC_MAP.get(codec_name, codec_name),
"bitrate": int(track.bitrate / 1000) if track.bitrate else None,
"channels": track.channels if track.channels else None,
"language": str(track.language) if track.language else None,
"atmos": track.atmos if hasattr(track, "atmos") else False,
"descriptive": track.descriptive if hasattr(track, "descriptive") else False,
"drm": str(track.drm) if hasattr(track, "drm") and track.drm else None,
}
def serialize_subtitle_track(track: Subtitle) -> Dict[str, Any]:
"""Convert subtitle track to JSON-serializable dict."""
return {
"id": str(track.id),
"codec": track.codec.name if hasattr(track.codec, "name") else str(track.codec),
"language": str(track.language) if track.language else None,
"forced": track.forced if hasattr(track, "forced") else False,
"sdh": track.sdh if hasattr(track, "sdh") else False,
"cc": track.cc if hasattr(track, "cc") else False,
}
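# Illustrative sketch (hypothetical values): the wire shapes the serializers
# above produce. Bitrates are reported in kbps (bitrate / 1000); codec_display
# and range_display come from the constants maps.
_EXAMPLE_VIDEO_TRACK = {
    "id": "v1", "codec": "H264", "codec_display": "AVC", "bitrate": 8000,
    "width": 1920, "height": 1080, "resolution": "1920x1080", "fps": 23.976,
    "range": "SDR", "range_display": "SDR", "language": "en", "drm": None,
}
_EXAMPLE_SUBTITLE_TRACK = {
    "id": "s1", "codec": "SubRip", "language": "en",
    "forced": False, "sdh": True, "cc": False,
}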
async def list_titles_handler(data: Dict[str, Any], request: Optional[web.Request] = None) -> web.Response:
"""Handle list-titles request."""
service_tag = data.get("service")
title_id = data.get("title_id")
profile = data.get("profile")
if not service_tag:
raise APIError(
APIErrorCode.INVALID_INPUT,
"Missing required parameter: service",
details={"missing_parameter": "service"},
)
if not title_id:
raise APIError(
APIErrorCode.INVALID_INPUT,
"Missing required parameter: title_id",
details={"missing_parameter": "title_id"},
)
normalized_service = validate_service(service_tag)
if not normalized_service:
raise APIError(
APIErrorCode.INVALID_SERVICE,
f"Invalid or unavailable service: {service_tag}",
details={"service": service_tag},
)
try:
import inspect
import click
import yaml
from unshackle.commands.dl import dl
from unshackle.core.config import config
from unshackle.core.utils.click_types import ContextData
from unshackle.core.utils.collections import merge_dict
service_config_path = Services.get_path(normalized_service) / config.filenames.config
if service_config_path.exists():
service_config = yaml.safe_load(service_config_path.read_text(encoding="utf8"))
else:
service_config = {}
merge_dict(config.services.get(normalized_service), service_config)
@click.command()
@click.pass_context
def dummy_service(ctx: click.Context) -> None:
pass
# Handle proxy configuration
proxy_param = data.get("proxy")
no_proxy = data.get("no_proxy", False)
proxy_providers = []
if not no_proxy:
proxy_providers = initialize_proxy_providers()
if proxy_param and not no_proxy:
try:
resolved_proxy = resolve_proxy(proxy_param, proxy_providers)
proxy_param = resolved_proxy
except ValueError as e:
raise APIError(
APIErrorCode.INVALID_PROXY,
f"Proxy error: {e}",
details={"proxy": proxy_param, "service": normalized_service},
)
ctx = click.Context(dummy_service)
ctx.obj = ContextData(config=service_config, cdm=None, proxy_providers=proxy_providers, profile=profile)
ctx.params = {"proxy": proxy_param, "no_proxy": no_proxy}
service_module = Services.load(normalized_service)
dummy_service.name = normalized_service
dummy_service.params = [click.Argument([title_id], type=str)]
ctx.invoked_subcommand = normalized_service
service_ctx = click.Context(dummy_service, parent=ctx)
service_ctx.obj = ctx.obj
service_kwargs = {"title": title_id}
# Add additional parameters from request data
for key, value in data.items():
if key not in ["service", "title_id", "profile", "season", "episode", "wanted", "proxy", "no_proxy"]:
service_kwargs[key] = value
# Get service parameter info and click command defaults
service_init_params = inspect.signature(service_module.__init__).parameters
# Extract default values from the click command
if hasattr(service_module, "cli") and hasattr(service_module.cli, "params"):
for param in service_module.cli.params:
if hasattr(param, "name") and param.name not in service_kwargs:
# Add default value if parameter is not already provided
if hasattr(param, "default") and param.default is not None:
service_kwargs[param.name] = param.default
# Handle required parameters that don't have click defaults
for param_name, param_info in service_init_params.items():
if param_name not in service_kwargs and param_name not in ["self", "ctx"]:
# Check if parameter is required (no default value in signature)
if param_info.default is inspect.Parameter.empty:
# Provide sensible defaults for common required parameters
if param_name == "meta_lang":
service_kwargs[param_name] = None
elif param_name == "movie":
service_kwargs[param_name] = False
else:
# Log warning for unknown required parameters
log.warning(f"Unknown required parameter '{param_name}' for service {normalized_service}")
# Filter out any parameters that the service doesn't accept
filtered_kwargs = {}
for key, value in service_kwargs.items():
if key in service_init_params:
filtered_kwargs[key] = value
service_instance = service_module(service_ctx, **filtered_kwargs)
cookies = dl.get_cookie_jar(normalized_service, profile)
credential = dl.get_credentials(normalized_service, profile)
service_instance.authenticate(cookies, credential)
titles = service_instance.get_titles()
if hasattr(titles, "__iter__") and not isinstance(titles, str):
title_list = [serialize_title(t) for t in titles]
else:
title_list = [serialize_title(titles)]
return web.json_response({"titles": title_list})
except APIError:
raise
except Exception as e:
log.exception("Error listing titles")
debug_mode = request.app.get("debug_api", False) if request else False
return handle_api_exception(
e,
context={"operation": "list_titles", "service": normalized_service, "title_id": title_id},
debug_mode=debug_mode,
)
async def list_tracks_handler(data: Dict[str, Any], request: Optional[web.Request] = None) -> web.Response:
"""Handle list-tracks request."""
service_tag = data.get("service")
title_id = data.get("title_id")
profile = data.get("profile")
if not service_tag:
raise APIError(
APIErrorCode.INVALID_INPUT,
"Missing required parameter: service",
details={"missing_parameter": "service"},
)
if not title_id:
raise APIError(
APIErrorCode.INVALID_INPUT,
"Missing required parameter: title_id",
details={"missing_parameter": "title_id"},
)
normalized_service = validate_service(service_tag)
if not normalized_service:
raise APIError(
APIErrorCode.INVALID_SERVICE,
f"Invalid or unavailable service: {service_tag}",
details={"service": service_tag},
)
try:
import inspect
import click
import yaml
from unshackle.commands.dl import dl
from unshackle.core.config import config
from unshackle.core.utils.click_types import ContextData
from unshackle.core.utils.collections import merge_dict
service_config_path = Services.get_path(normalized_service) / config.filenames.config
if service_config_path.exists():
service_config = yaml.safe_load(service_config_path.read_text(encoding="utf8"))
else:
service_config = {}
merge_dict(config.services.get(normalized_service), service_config)
@click.command()
@click.pass_context
def dummy_service(ctx: click.Context) -> None:
pass
# Handle proxy configuration
proxy_param = data.get("proxy")
no_proxy = data.get("no_proxy", False)
proxy_providers = []
if not no_proxy:
proxy_providers = initialize_proxy_providers()
if proxy_param and not no_proxy:
try:
resolved_proxy = resolve_proxy(proxy_param, proxy_providers)
proxy_param = resolved_proxy
except ValueError as e:
raise APIError(
APIErrorCode.INVALID_PROXY,
f"Proxy error: {e}",
details={"proxy": proxy_param, "service": normalized_service},
)
ctx = click.Context(dummy_service)
ctx.obj = ContextData(config=service_config, cdm=None, proxy_providers=proxy_providers, profile=profile)
ctx.params = {"proxy": proxy_param, "no_proxy": no_proxy}
service_module = Services.load(normalized_service)
dummy_service.name = normalized_service
dummy_service.params = [click.Argument([title_id], type=str)]
ctx.invoked_subcommand = normalized_service
service_ctx = click.Context(dummy_service, parent=ctx)
service_ctx.obj = ctx.obj
service_kwargs = {"title": title_id}
# Add additional parameters from request data
for key, value in data.items():
if key not in ["service", "title_id", "profile", "season", "episode", "wanted", "proxy", "no_proxy"]:
service_kwargs[key] = value
# Get service parameter info and click command defaults
service_init_params = inspect.signature(service_module.__init__).parameters
# Extract default values from the click command
if hasattr(service_module, "cli") and hasattr(service_module.cli, "params"):
for param in service_module.cli.params:
if hasattr(param, "name") and param.name not in service_kwargs:
# Add default value if parameter is not already provided
if hasattr(param, "default") and param.default is not None:
service_kwargs[param.name] = param.default
# Handle required parameters that don't have click defaults
for param_name, param_info in service_init_params.items():
if param_name not in service_kwargs and param_name not in ["self", "ctx"]:
# Check if parameter is required (no default value in signature)
if param_info.default is inspect.Parameter.empty:
# Provide sensible defaults for common required parameters
if param_name == "meta_lang":
service_kwargs[param_name] = None
elif param_name == "movie":
service_kwargs[param_name] = False
else:
# Log warning for unknown required parameters
log.warning(f"Unknown required parameter '{param_name}' for service {normalized_service}")
# Filter out any parameters that the service doesn't accept
filtered_kwargs = {}
for key, value in service_kwargs.items():
if key in service_init_params:
filtered_kwargs[key] = value
service_instance = service_module(service_ctx, **filtered_kwargs)
cookies = dl.get_cookie_jar(normalized_service, profile)
credential = dl.get_credentials(normalized_service, profile)
service_instance.authenticate(cookies, credential)
titles = service_instance.get_titles()
wanted_param = data.get("wanted")
season = data.get("season")
episode = data.get("episode")
if hasattr(titles, "__iter__") and not isinstance(titles, str):
titles_list = list(titles)
wanted = None
if wanted_param:
from unshackle.core.utils.click_types import SeasonRange
try:
season_range = SeasonRange()
wanted = season_range.parse_tokens(wanted_param)
log.debug(f"Parsed wanted '{wanted_param}' into {len(wanted)} episodes: {wanted[:10]}...")
except Exception as e:
raise APIError(
APIErrorCode.INVALID_PARAMETERS,
f"Invalid wanted parameter: {e}",
details={"wanted": wanted_param, "service": normalized_service},
)
elif season is not None and episode is not None:
wanted = [f"{season}x{episode}"]
if wanted:
# Filter titles based on wanted episodes, similar to how dl.py does it
matching_titles = []
log.debug(f"Filtering {len(titles_list)} titles with {len(wanted)} wanted episodes")
for title in titles_list:
if isinstance(title, Episode):
episode_key = f"{title.season}x{title.number}"
if episode_key in wanted:
log.debug(f"Episode {episode_key} matches wanted list")
matching_titles.append(title)
else:
log.debug(f"Episode {episode_key} not in wanted list")
else:
matching_titles.append(title)
log.debug(f"Found {len(matching_titles)} matching titles")
if not matching_titles:
raise APIError(
APIErrorCode.NO_CONTENT,
"No episodes found matching wanted criteria",
details={
"service": normalized_service,
"title_id": title_id,
"wanted": wanted_param or f"{season}x{episode}",
},
)
# If multiple episodes match, return tracks for all episodes
if len(matching_titles) > 1 and all(isinstance(t, Episode) for t in matching_titles):
episodes_data = []
failed_episodes = []
# Sort matching titles by season and episode number for consistent ordering
sorted_titles = sorted(matching_titles, key=lambda t: (t.season, t.number))
for title in sorted_titles:
try:
tracks = service_instance.get_tracks(title)
video_tracks = sorted(tracks.videos, key=lambda t: t.bitrate or 0, reverse=True)
audio_tracks = sorted(tracks.audio, key=lambda t: t.bitrate or 0, reverse=True)
episode_data = {
"title": serialize_title(title),
"video": [serialize_video_track(t) for t in video_tracks],
"audio": [serialize_audio_track(t) for t in audio_tracks],
"subtitles": [serialize_subtitle_track(t) for t in tracks.subtitles],
}
episodes_data.append(episode_data)
log.debug(f"Successfully got tracks for {title.season}x{title.number}")
except SystemExit:
# Service calls sys.exit() for unavailable episodes - catch and skip
failed_episodes.append(f"S{title.season}E{title.number:02d}")
log.debug(f"Episode {title.season}x{title.number} not available, skipping")
continue
except Exception as e:
# Handle other errors gracefully
failed_episodes.append(f"S{title.season}E{title.number:02d}")
log.debug(f"Error getting tracks for {title.season}x{title.number}: {e}")
continue
if episodes_data:
response = {"episodes": episodes_data}
if failed_episodes:
response["unavailable_episodes"] = failed_episodes
return web.json_response(response)
else:
raise APIError(
APIErrorCode.NO_CONTENT,
f"No available episodes found. Unavailable: {', '.join(failed_episodes)}",
details={
"service": normalized_service,
"title_id": title_id,
"unavailable_episodes": failed_episodes,
},
)
else:
# Single episode or movie
first_title = matching_titles[0]
else:
first_title = titles_list[0]
else:
first_title = titles
tracks = service_instance.get_tracks(first_title)
video_tracks = sorted(tracks.videos, key=lambda t: t.bitrate or 0, reverse=True)
audio_tracks = sorted(tracks.audio, key=lambda t: t.bitrate or 0, reverse=True)
response = {
"title": serialize_title(first_title),
"video": [serialize_video_track(t) for t in video_tracks],
"audio": [serialize_audio_track(t) for t in audio_tracks],
"subtitles": [serialize_subtitle_track(t) for t in tracks.subtitles],
}
return web.json_response(response)
except APIError:
raise
except Exception as e:
log.exception("Error listing tracks")
debug_mode = request.app.get("debug_api", False) if request else False
return handle_api_exception(
e,
context={"operation": "list_tracks", "service": normalized_service, "title_id": title_id},
debug_mode=debug_mode,
)
def validate_download_parameters(data: Dict[str, Any]) -> Optional[str]:
"""
Validate download parameters and return error message if invalid.
Returns:
None if valid, error message string if invalid
"""
if "vcodec" in data and data["vcodec"]:
valid_vcodecs = ["H264", "H265", "VP9", "AV1"]
if data["vcodec"].upper() not in valid_vcodecs:
return f"Invalid vcodec: {data['vcodec']}. Must be one of: {', '.join(valid_vcodecs)}"
if "acodec" in data and data["acodec"]:
valid_acodecs = ["AAC", "AC3", "EAC3", "OPUS", "FLAC", "ALAC", "VORBIS", "DTS"]
if data["acodec"].upper() not in valid_acodecs:
return f"Invalid acodec: {data['acodec']}. Must be one of: {', '.join(valid_acodecs)}"
if "sub_format" in data and data["sub_format"]:
valid_sub_formats = ["SRT", "VTT", "ASS", "SSA"]
if data["sub_format"].upper() not in valid_sub_formats:
return f"Invalid sub_format: {data['sub_format']}. Must be one of: {', '.join(valid_sub_formats)}"
if "vbitrate" in data and data["vbitrate"] is not None:
if not isinstance(data["vbitrate"], int) or data["vbitrate"] <= 0:
return "vbitrate must be a positive integer"
if "abitrate" in data and data["abitrate"] is not None:
if not isinstance(data["abitrate"], int) or data["abitrate"] <= 0:
return "abitrate must be a positive integer"
if "channels" in data and data["channels"] is not None:
if not isinstance(data["channels"], (int, float)) or data["channels"] <= 0:
return "channels must be a positive number"
if "workers" in data and data["workers"] is not None:
if not isinstance(data["workers"], int) or data["workers"] <= 0:
return "workers must be a positive integer"
if "downloads" in data and data["downloads"] is not None:
if not isinstance(data["downloads"], int) or data["downloads"] <= 0:
return "downloads must be a positive integer"
exclusive_flags = []
if data.get("video_only"):
exclusive_flags.append("video_only")
if data.get("audio_only"):
exclusive_flags.append("audio_only")
if data.get("subs_only"):
exclusive_flags.append("subs_only")
if data.get("chapters_only"):
exclusive_flags.append("chapters_only")
if len(exclusive_flags) > 1:
return f"Cannot use multiple exclusive flags: {', '.join(exclusive_flags)}"
if data.get("no_subs") and data.get("subs_only"):
return "Cannot use both no_subs and subs_only"
if data.get("no_audio") and data.get("audio_only"):
return "Cannot use both no_audio and audio_only"
if data.get("s_lang") and data.get("require_subs"):
return "Cannot use both s_lang and require_subs"
if "range" in data and data["range"]:
valid_ranges = ["SDR", "HDR10", "HDR10+", "DV", "HLG"]
if isinstance(data["range"], list):
for r in data["range"]:
if r.upper() not in valid_ranges:
return f"Invalid range value: {r}. Must be one of: {', '.join(valid_ranges)}"
elif data["range"].upper() not in valid_ranges:
return f"Invalid range value: {data['range']}. Must be one of: {', '.join(valid_ranges)}"
return None
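# Illustrative sketch: the validator returns None on success and a
# human-readable message on the first failure.
def _example_validate() -> None:
    assert validate_download_parameters({"vcodec": "h265", "vbitrate": 8000}) is None
    msg = validate_download_parameters({"video_only": True, "audio_only": True})
    assert msg == "Cannot use multiple exclusive flags: video_only, audio_only"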
async def download_handler(data: Dict[str, Any], request: Optional[web.Request] = None) -> web.Response:
"""Handle download request - create and queue a download job."""
from unshackle.core.api.download_manager import get_download_manager
service_tag = data.get("service")
title_id = data.get("title_id")
if not service_tag:
raise APIError(
APIErrorCode.INVALID_INPUT,
"Missing required parameter: service",
details={"missing_parameter": "service"},
)
if not title_id:
raise APIError(
APIErrorCode.INVALID_INPUT,
"Missing required parameter: title_id",
details={"missing_parameter": "title_id"},
)
normalized_service = validate_service(service_tag)
if not normalized_service:
raise APIError(
APIErrorCode.INVALID_SERVICE,
f"Invalid or unavailable service: {service_tag}",
details={"service": service_tag},
)
validation_error = validate_download_parameters(data)
if validation_error:
raise APIError(
APIErrorCode.INVALID_PARAMETERS,
validation_error,
details={"service": normalized_service, "title_id": title_id},
)
try:
# Load service module to extract service-specific parameter defaults
service_module = Services.load(normalized_service)
service_specific_defaults = {}
# Extract default values from the service's click command
if hasattr(service_module, "cli") and hasattr(service_module.cli, "params"):
for param in service_module.cli.params:
if hasattr(param, "name") and hasattr(param, "default") and param.default is not None:
# Store service-specific defaults (e.g., drm_system, hydrate_track, profile for NF)
service_specific_defaults[param.name] = param.default
# Get download manager and start workers if needed
manager = get_download_manager()
await manager.start_workers()
# Create download job with filtered parameters (exclude service and title_id as they're already passed)
filtered_params = {k: v for k, v in data.items() if k not in ["service", "title_id"]}
# Merge defaults with provided parameters (user params override service defaults, which override global defaults)
params_with_defaults = {**DEFAULT_DOWNLOAD_PARAMS, **service_specific_defaults, **filtered_params}
job = manager.create_job(normalized_service, title_id, **params_with_defaults)
return web.json_response(
{"job_id": job.job_id, "status": job.status.value, "created_time": job.created_time.isoformat()}, status=202
)
except APIError:
raise
except Exception as e:
log.exception("Error creating download job")
debug_mode = request.app.get("debug_api", False) if request else False
return handle_api_exception(
e,
context={"operation": "create_download_job", "service": normalized_service, "title_id": title_id},
debug_mode=debug_mode,
)
async def list_download_jobs_handler(data: Dict[str, Any], request: Optional[web.Request] = None) -> web.Response:
"""Handle list download jobs request with optional filtering and sorting."""
from unshackle.core.api.download_manager import get_download_manager
try:
manager = get_download_manager()
jobs = manager.list_jobs()
status_filter = data.get("status")
if status_filter:
jobs = [job for job in jobs if job.status.value == status_filter]
service_filter = data.get("service")
if service_filter:
jobs = [job for job in jobs if job.service == service_filter]
sort_by = data.get("sort_by", "created_time")
sort_order = data.get("sort_order", "desc")
valid_sort_fields = ["created_time", "started_time", "completed_time", "progress", "status", "service"]
if sort_by not in valid_sort_fields:
raise APIError(
APIErrorCode.INVALID_PARAMETERS,
f"Invalid sort_by: {sort_by}. Must be one of: {', '.join(valid_sort_fields)}",
details={"sort_by": sort_by, "valid_values": valid_sort_fields},
)
if sort_order not in ["asc", "desc"]:
raise APIError(
APIErrorCode.INVALID_PARAMETERS,
"Invalid sort_order: must be 'asc' or 'desc'",
details={"sort_order": sort_order, "valid_values": ["asc", "desc"]},
)
reverse = sort_order == "desc"
def get_sort_key(job):
"""Get the sorting key value, handling None values."""
value = getattr(job, sort_by, None)
if value is None:
if sort_by in ["created_time", "started_time", "completed_time"]:
from datetime import datetime
return datetime.min if not reverse else datetime.max
elif sort_by == "progress":
return 0
elif sort_by in ["status", "service"]:
return ""
return value
jobs = sorted(jobs, key=get_sort_key, reverse=reverse)
job_list = [job.to_dict(include_full_details=False) for job in jobs]
return web.json_response({"jobs": job_list})
except APIError:
raise
except Exception as e:
log.exception("Error listing download jobs")
debug_mode = request.app.get("debug_api", False) if request else False
return handle_api_exception(
e,
context={"operation": "list_download_jobs"},
debug_mode=debug_mode,
)
async def get_download_job_handler(job_id: str, request: Optional[web.Request] = None) -> web.Response:
"""Handle get specific download job request."""
from unshackle.core.api.download_manager import get_download_manager
try:
manager = get_download_manager()
job = manager.get_job(job_id)
if not job:
raise APIError(
APIErrorCode.JOB_NOT_FOUND,
"Job not found",
details={"job_id": job_id},
)
return web.json_response(job.to_dict(include_full_details=True))
except APIError:
raise
except Exception as e:
log.exception(f"Error getting download job {job_id}")
debug_mode = request.app.get("debug_api", False) if request else False
return handle_api_exception(
e,
context={"operation": "get_download_job", "job_id": job_id},
debug_mode=debug_mode,
)
async def cancel_download_job_handler(job_id: str, request: Optional[web.Request] = None) -> web.Response:
"""Handle cancel download job request."""
from unshackle.core.api.download_manager import get_download_manager
try:
manager = get_download_manager()
if not manager.get_job(job_id):
raise APIError(
APIErrorCode.JOB_NOT_FOUND,
"Job not found",
details={"job_id": job_id},
)
success = manager.cancel_job(job_id)
if success:
return web.json_response({"status": "success", "message": "Job cancelled"})
else:
raise APIError(
APIErrorCode.INVALID_PARAMETERS,
"Job cannot be cancelled (already completed or failed)",
details={"job_id": job_id},
)
except APIError:
raise
except Exception as e:
log.exception(f"Error cancelling download job {job_id}")
debug_mode = request.app.get("debug_api", False) if request else False
return handle_api_exception(
e,
context={"operation": "cancel_download_job", "job_id": job_id},
debug_mode=debug_mode,
)
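# Illustrative sketch (not wired to a route): the handlers above accept a plain
# dict and an optional request, so they can be driven directly. The status
# string assumes JobStatus values are lowercase, matching the filter check.
async def _example_list_jobs() -> web.Response:
    return await list_download_jobs_handler(
        {"status": "completed", "sort_by": "progress", "sort_order": "asc"}
    )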

View File

@ -0,0 +1,758 @@
import logging
import re
from aiohttp import web
from aiohttp_swagger3 import SwaggerDocs, SwaggerInfo, SwaggerUiSettings
from unshackle.core import __version__
from unshackle.core.api.errors import APIError, APIErrorCode, build_error_response, handle_api_exception
from unshackle.core.api.handlers import (cancel_download_job_handler, download_handler, get_download_job_handler,
list_download_jobs_handler, list_titles_handler, list_tracks_handler)
from unshackle.core.services import Services
from unshackle.core.update_checker import UpdateChecker
@web.middleware
async def cors_middleware(request: web.Request, handler):
"""Add CORS headers to all responses."""
# Handle preflight requests
if request.method == "OPTIONS":
response = web.Response()
else:
response = await handler(request)
# Add CORS headers
response.headers["Access-Control-Allow-Origin"] = "*"
response.headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE, OPTIONS"
response.headers["Access-Control-Allow-Headers"] = "Content-Type, X-API-Key, Authorization"
response.headers["Access-Control-Max-Age"] = "3600"
return response
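# Illustrative sketch (hypothetical app factory): registering the middleware so
# every response, including OPTIONS preflights, carries the CORS headers above.
def _example_app() -> web.Application:
    return web.Application(middlewares=[cors_middleware])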
log = logging.getLogger("api")
async def health(request: web.Request) -> web.Response:
"""
Health check endpoint.
---
summary: Health check
description: Get server health status, version info, and update availability
responses:
'200':
description: Health status
content:
application/json:
schema:
type: object
properties:
status:
type: string
example: ok
version:
type: string
example: "2.0.0"
update_check:
type: object
properties:
update_available:
type: boolean
nullable: true
current_version:
type: string
latest_version:
type: string
nullable: true
"""
try:
latest_version = await UpdateChecker.check_for_updates(__version__)
update_info = {
"update_available": latest_version is not None,
"current_version": __version__,
"latest_version": latest_version,
}
except Exception as e:
log.warning(f"Failed to check for updates: {e}")
update_info = {"update_available": None, "current_version": __version__, "latest_version": None}
return web.json_response({"status": "ok", "version": __version__, "update_check": update_info})
async def services(request: web.Request) -> web.Response:
"""
List available services.
---
summary: List services
description: Get all available streaming services with their details
responses:
'200':
description: List of services
content:
application/json:
schema:
type: object
properties:
services:
type: array
items:
type: object
properties:
tag:
type: string
aliases:
type: array
items:
type: string
geofence:
type: array
items:
type: string
title_regex:
oneOf:
- type: string
- type: array
items:
type: string
nullable: true
url:
type: string
nullable: true
description: Service URL from short_help
help:
type: string
nullable: true
description: Full service documentation
'500':
description: Server error
content:
application/json:
schema:
type: object
properties:
status:
type: string
example: error
error_code:
type: string
example: INTERNAL_ERROR
message:
type: string
example: An unexpected error occurred
details:
type: object
timestamp:
type: string
format: date-time
debug_info:
type: object
description: Only present when --debug-api flag is enabled
"""
try:
service_tags = Services.get_tags()
services_info = []
for tag in service_tags:
service_data = {"tag": tag, "aliases": [], "geofence": [], "title_regex": None, "url": None, "help": None}
try:
service_module = Services.load(tag)
if hasattr(service_module, "ALIASES"):
service_data["aliases"] = list(service_module.ALIASES)
if hasattr(service_module, "GEOFENCE"):
service_data["geofence"] = list(service_module.GEOFENCE)
if hasattr(service_module, "TITLE_RE"):
title_re = service_module.TITLE_RE
# Handle different types of TITLE_RE
if isinstance(title_re, re.Pattern):
service_data["title_regex"] = title_re.pattern
elif isinstance(title_re, str):
service_data["title_regex"] = title_re
elif isinstance(title_re, (list, tuple)):
# Convert list/tuple of patterns to list of strings
patterns = []
for item in title_re:
if isinstance(item, re.Pattern):
patterns.append(item.pattern)
elif isinstance(item, str):
patterns.append(item)
service_data["title_regex"] = patterns if patterns else None
if hasattr(service_module, "cli") and hasattr(service_module.cli, "short_help"):
service_data["url"] = service_module.cli.short_help
if service_module.__doc__:
service_data["help"] = service_module.__doc__.strip()
except Exception as e:
log.warning(f"Could not load details for service {tag}: {e}")
services_info.append(service_data)
return web.json_response({"services": services_info})
except Exception as e:
log.exception("Error listing services")
debug_mode = request.app.get("debug_api", False)
return handle_api_exception(e, context={"operation": "list_services"}, debug_mode=debug_mode)
async def list_titles(request: web.Request) -> web.Response:
"""
List titles for a service and title ID.
---
summary: List titles
description: Get available titles for a service and title ID
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- service
- title_id
properties:
service:
type: string
description: Service tag
title_id:
type: string
description: Title identifier
responses:
'200':
description: List of titles
'400':
description: Invalid request (missing parameters, invalid service)
content:
application/json:
schema:
type: object
properties:
status:
type: string
example: error
error_code:
type: string
example: INVALID_INPUT
message:
type: string
example: Missing required parameter
details:
type: object
timestamp:
type: string
format: date-time
'401':
description: Authentication failed
content:
application/json:
schema:
type: object
properties:
status:
type: string
example: error
error_code:
type: string
example: AUTH_FAILED
message:
type: string
details:
type: object
timestamp:
type: string
format: date-time
'404':
description: Title not found
content:
application/json:
schema:
type: object
properties:
status:
type: string
example: error
error_code:
type: string
example: NOT_FOUND
message:
type: string
details:
type: object
timestamp:
type: string
format: date-time
'500':
description: Server error
content:
application/json:
schema:
type: object
properties:
status:
type: string
example: error
error_code:
type: string
example: INTERNAL_ERROR
message:
type: string
details:
type: object
timestamp:
type: string
format: date-time
"""
try:
data = await request.json()
except Exception as e:
return build_error_response(
APIError(
APIErrorCode.INVALID_INPUT,
"Invalid JSON request body",
details={"error": str(e)},
),
request.app.get("debug_api", False),
)
try:
return await list_titles_handler(data, request)
except APIError as e:
debug_mode = request.app.get("debug_api", False)
return build_error_response(e, debug_mode)
async def list_tracks(request: web.Request) -> web.Response:
"""
List tracks for a title, separated by type.
---
summary: List tracks
description: Get available video, audio, and subtitle tracks for a title
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- service
- title_id
properties:
service:
type: string
description: Service tag
title_id:
type: string
description: Title identifier
wanted:
type: string
description: Specific episode/season (optional)
proxy:
type: string
description: Proxy configuration (optional)
responses:
'200':
description: Track information
'400':
description: Invalid request
"""
try:
data = await request.json()
except Exception as e:
return build_error_response(
APIError(
APIErrorCode.INVALID_INPUT,
"Invalid JSON request body",
details={"error": str(e)},
),
request.app.get("debug_api", False),
)
try:
return await list_tracks_handler(data, request)
except APIError as e:
debug_mode = request.app.get("debug_api", False)
return build_error_response(e, debug_mode)
async def download(request: web.Request) -> web.Response:
"""
Download content based on provided parameters.
---
summary: Download content
description: Download video content based on specified parameters
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- service
- title_id
properties:
service:
type: string
description: Service tag
title_id:
type: string
description: Title identifier
profile:
type: string
description: Profile to use for credentials and cookies (default - None)
quality:
type: array
items:
type: integer
description: Download resolution(s) (default - best available)
vcodec:
type: string
description: Video codec to download (e.g., H264, H265, VP9, AV1) (default - None)
acodec:
type: string
description: Audio codec to download (e.g., AAC, AC3, EAC3) (default - None)
vbitrate:
type: integer
description: Video bitrate in kbps (default - None)
abitrate:
type: integer
description: Audio bitrate in kbps (default - None)
range:
type: array
items:
type: string
description: Video color range (SDR, HDR10, DV) (default - ["SDR"])
channels:
type: number
description: Audio channels (e.g., 2.0, 5.1, 7.1) (default - None)
no_atmos:
type: boolean
description: Exclude Dolby Atmos audio tracks (default - false)
wanted:
type: array
items:
type: string
description: Wanted episodes (e.g., ["S01E01", "S01E02"]) (default - all)
latest_episode:
type: boolean
description: Download only the single most recent episode (default - false)
lang:
type: array
items:
type: string
description: Language for video and audio (use 'orig' for original) (default - ["orig"])
v_lang:
type: array
items:
type: string
description: Language for video tracks only (default - [])
a_lang:
type: array
items:
type: string
description: Language for audio tracks only (default - [])
s_lang:
type: array
items:
type: string
description: Language for subtitle tracks (default - ["all"])
require_subs:
type: array
items:
type: string
description: Required subtitle languages (default - [])
forced_subs:
type: boolean
description: Include forced subtitle tracks (default - false)
exact_lang:
type: boolean
description: Use exact language matching (no variants) (default - false)
sub_format:
type: string
description: Output subtitle format (SRT, VTT, etc.) (default - None)
video_only:
type: boolean
description: Only download video tracks (default - false)
audio_only:
type: boolean
description: Only download audio tracks (default - false)
subs_only:
type: boolean
description: Only download subtitle tracks (default - false)
chapters_only:
type: boolean
description: Only download chapters (default - false)
no_subs:
type: boolean
description: Do not download subtitle tracks (default - false)
no_audio:
type: boolean
description: Do not download audio tracks (default - false)
no_chapters:
type: boolean
description: Do not download chapters (default - false)
audio_description:
type: boolean
description: Download audio description tracks (default - false)
slow:
type: boolean
description: Add 60-120s delay between downloads (default - false)
skip_dl:
type: boolean
description: Skip downloading, only retrieve decryption keys (default - false)
export:
type: string
description: Path to export decryption keys as JSON (default - None)
cdm_only:
type: boolean
description: Only use CDM for key retrieval (true) or only vaults (false) (default - None)
proxy:
type: string
description: Proxy URI or country code (default - None)
no_proxy:
type: boolean
description: Force disable all proxy use (default - false)
tag:
type: string
description: Set the group tag to be used (default - None)
tmdb_id:
type: integer
description: Use this TMDB ID for tagging (default - None)
tmdb_name:
type: boolean
description: Rename titles using TMDB name (default - false)
tmdb_year:
type: boolean
description: Use release year from TMDB (default - false)
no_folder:
type: boolean
description: Disable folder creation for TV shows (default - false)
no_source:
type: boolean
description: Disable source tag from output file name (default - false)
no_mux:
type: boolean
description: Do not mux tracks into a container file (default - false)
workers:
type: integer
description: Max workers/threads per track download (default - None)
downloads:
type: integer
description: Amount of tracks to download concurrently (default - 1)
best_available:
type: boolean
description: Continue with best available if requested quality unavailable (default - false)
responses:
'202':
description: Download job created
content:
application/json:
schema:
type: object
properties:
job_id:
type: string
status:
type: string
created_time:
type: string
'400':
description: Invalid request
"""
try:
data = await request.json()
except Exception as e:
return build_error_response(
APIError(
APIErrorCode.INVALID_INPUT,
"Invalid JSON request body",
details={"error": str(e)},
),
request.app.get("debug_api", False),
)
try:
return await download_handler(data, request)
except APIError as e:
debug_mode = request.app.get("debug_api", False)
return build_error_response(e, debug_mode)
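For reference, a client call to this endpoint could look like the sketch below; the host, port, and payload values are illustrative assumptions, and only service and title_id are required by the schema above.

# Illustrative client sketch; host, port, and all values are assumptions.
import requests

payload = {
    "service": "EXAMPLE",  # hypothetical service tag
    "title_id": "12345",  # hypothetical title identifier
    "quality": [1080],
    "forced_subs": True,
}
resp = requests.post("http://127.0.0.1:8000/api/download", json=payload, timeout=30)
resp.raise_for_status()
job = resp.json()  # 202 body: job_id, status, created_time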
async def download_jobs(request: web.Request) -> web.Response:
"""
List all download jobs with optional filtering and sorting.
---
summary: List download jobs
description: Get list of all download jobs with their status, with optional filtering by status/service and sorting
parameters:
- name: status
in: query
required: false
schema:
type: string
enum: [queued, downloading, completed, failed, cancelled]
description: Filter jobs by status
- name: service
in: query
required: false
schema:
type: string
description: Filter jobs by service tag
- name: sort_by
in: query
required: false
schema:
type: string
enum: [created_time, started_time, completed_time, progress, status, service]
default: created_time
description: Field to sort by
- name: sort_order
in: query
required: false
schema:
type: string
enum: [asc, desc]
default: desc
description: Sort order (ascending or descending)
responses:
'200':
description: List of download jobs
content:
application/json:
schema:
type: object
properties:
jobs:
type: array
items:
type: object
properties:
job_id:
type: string
status:
type: string
created_time:
type: string
service:
type: string
title_id:
type: string
progress:
type: number
'400':
description: Invalid query parameters
'500':
description: Server error
"""
# Extract query parameters
query_params = {
"status": request.query.get("status"),
"service": request.query.get("service"),
"sort_by": request.query.get("sort_by", "created_time"),
"sort_order": request.query.get("sort_order", "desc"),
}
try:
return await list_download_jobs_handler(query_params, request)
except APIError as e:
debug_mode = request.app.get("debug_api", False)
return build_error_response(e, debug_mode)
async def download_job_detail(request: web.Request) -> web.Response:
"""
Get download job details.
---
summary: Get download job
description: Get detailed information about a specific download job
parameters:
- name: job_id
in: path
required: true
schema:
type: string
responses:
'200':
description: Download job details
'404':
description: Job not found
'500':
description: Server error
"""
job_id = request.match_info["job_id"]
try:
return await get_download_job_handler(job_id, request)
except APIError as e:
debug_mode = request.app.get("debug_api", False)
return build_error_response(e, debug_mode)
async def cancel_download_job(request: web.Request) -> web.Response:
"""
Cancel download job.
---
summary: Cancel download job
description: Cancel a queued or running download job
parameters:
- name: job_id
in: path
required: true
schema:
type: string
responses:
'200':
description: Job cancelled successfully
'400':
description: Job cannot be cancelled
'404':
description: Job not found
'500':
description: Server error
"""
job_id = request.match_info["job_id"]
try:
return await cancel_download_job_handler(job_id, request)
except APIError as e:
debug_mode = request.app.get("debug_api", False)
return build_error_response(e, debug_mode)
def setup_routes(app: web.Application) -> None:
"""Setup all API routes."""
app.router.add_get("/api/health", health)
app.router.add_get("/api/services", services)
app.router.add_post("/api/list-titles", list_titles)
app.router.add_post("/api/list-tracks", list_tracks)
app.router.add_post("/api/download", download)
app.router.add_get("/api/download/jobs", download_jobs)
app.router.add_get("/api/download/jobs/{job_id}", download_job_detail)
app.router.add_delete("/api/download/jobs/{job_id}", cancel_download_job)
def setup_swagger(app: web.Application) -> None:
"""Setup Swagger UI documentation."""
swagger = SwaggerDocs(
app,
swagger_ui_settings=SwaggerUiSettings(path="/api/docs/"),
info=SwaggerInfo(
title="Unshackle REST API",
version=__version__,
description="REST API for Unshackle - Modular Movie, TV, and Music Archival Software",
),
)
# Add routes with OpenAPI documentation
swagger.add_routes(
[
web.get("/api/health", health),
web.get("/api/services", services),
web.post("/api/list-titles", list_titles),
web.post("/api/list-tracks", list_tracks),
web.post("/api/download", download),
web.get("/api/download/jobs", download_jobs),
web.get("/api/download/jobs/{job_id}", download_job_detail),
web.delete("/api/download/jobs/{job_id}", cancel_download_job),
]
)
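A minimal wiring sketch, assuming the module is run directly; the port is an arbitrary choice, and debug_api is the app key the handlers above read:

if __name__ == "__main__":
    # SwaggerDocs registers the routes passed to add_routes, so setup_swagger
    # alone serves both the API endpoints and the /api/docs/ UI.
    app = web.Application()
    app["debug_api"] = False
    setup_swagger(app)
    web.run_app(app, port=8000)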

unshackle/core/binaries.py Normal file

@@ -0,0 +1,75 @@
import shutil
import sys
from pathlib import Path
from typing import Optional
__shaka_platform = {"win32": "win", "darwin": "osx"}.get(sys.platform, sys.platform)
def find(*names: str) -> Optional[Path]:
"""Find the path of the first found binary name."""
current_dir = Path(__file__).resolve().parent.parent
local_binaries_dir = current_dir / "binaries"
ext = ".exe" if sys.platform == "win32" else ""
for name in names:
if local_binaries_dir.exists():
candidate_paths = [local_binaries_dir / f"{name}{ext}", local_binaries_dir / name / f"{name}{ext}"]
for path in candidate_paths:
if path.is_file():
# On Unix-like systems, check if file is executable
if sys.platform == "win32" or (path.stat().st_mode & 0o111):
return path
# Fall back to system PATH
path = shutil.which(name)
if path:
return Path(path)
return None
FFMPEG = find("ffmpeg")
FFProbe = find("ffprobe")
FFPlay = find("ffplay")
SubtitleEdit = find("SubtitleEdit")
ShakaPackager = find(
"shaka-packager",
"packager",
f"packager-{__shaka_platform}",
f"packager-{__shaka_platform}-arm64",
f"packager-{__shaka_platform}-x64",
)
Aria2 = find("aria2c", "aria2")
CCExtractor = find("ccextractor", "ccextractorwin", "ccextractorwinfull")
HolaProxy = find("hola-proxy")
MPV = find("mpv")
Caddy = find("caddy")
N_m3u8DL_RE = find("N_m3u8DL-RE", "n-m3u8dl-re")
MKVToolNix = find("mkvmerge")
Mkvpropedit = find("mkvpropedit")
DoviTool = find("dovi_tool")
HDR10PlusTool = find("hdr10plus_tool", "HDR10Plus_tool")
Mp4decrypt = find("mp4decrypt")
__all__ = (
"FFMPEG",
"FFProbe",
"FFPlay",
"SubtitleEdit",
"ShakaPackager",
"Aria2",
"CCExtractor",
"HolaProxy",
"MPV",
"Caddy",
"N_m3u8DL_RE",
"MKVToolNix",
"Mkvpropedit",
"DoviTool",
"HDR10PlusTool",
"Mp4decrypt",
"find",
)
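A quick self-check sketch, purely illustrative, to see which binaries resolved on the current machine:

if __name__ == "__main__":
    # Report each exported binary and whether it was found (illustrative only).
    for name in __all__:
        if name == "find":
            continue
        path = globals()[name]
        print(f"{name}: {path if path else 'not found'}")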

unshackle/core/cacher.py Normal file

@@ -0,0 +1,156 @@
from __future__ import annotations
import zlib
from datetime import datetime, timedelta
from os import stat_result
from pathlib import Path
from typing import Any, Optional, Union
import jsonpickle
import jwt
from unshackle.core.config import config
EXP_T = Union[datetime, str, int, float]
class Cacher:
"""Cacher for Services to get and set arbitrary data with expiration dates."""
def __init__(
self,
service_tag: str,
key: Optional[str] = None,
version: Optional[int] = 1,
data: Optional[Any] = None,
expiration: Optional[datetime] = None,
) -> None:
self.service_tag = service_tag
self.key = key
self.version = version
self.data = data or {}
self.expiration = expiration
if self.expiration and self.expired:
            # if it's expired, remove the data for safety and delete the cache file
self.data = None
self.path.unlink()
def __bool__(self) -> bool:
return bool(self.data)
@property
def path(self) -> Path:
"""Get the path at which the cache will be read and written."""
return (config.directories.cache / self.service_tag / self.key).with_suffix(".json")
@property
def expired(self) -> bool:
        return bool(self.expiration and self.expiration < datetime.now())
def get(self, key: str, version: int = 1) -> Cacher:
"""
Get Cached data for the Service by Key.
:param key: the filename to save the data to, should be url-safe.
:param version: the config data version you expect to use.
        :returns: a Cacher object containing the cached data; its data is empty if the cache file does not exist.
"""
cache = Cacher(self.service_tag, key, version)
if cache.path.is_file():
data = jsonpickle.loads(cache.path.read_text(encoding="utf8"))
payload = data.copy()
del payload["crc32"]
checksum = data["crc32"]
calculated = zlib.crc32(jsonpickle.dumps(payload).encode("utf8"))
if calculated != checksum:
raise ValueError(
f"The checksum of the Cache payload mismatched. Checksum: {checksum} !== Calculated: {calculated}"
)
cache.data = data["data"]
cache.expiration = data["expiration"]
cache.version = data["version"]
if cache.version != version:
raise ValueError(
f"The version of your {self.service_tag} {key} cache is outdated. Please delete: {cache.path}"
)
return cache
def set(self, data: Any, expiration: Optional[EXP_T] = None) -> Any:
"""
Set Cached data for the Service by Key.
:param data: absolutely anything including None.
:param expiration: when the data expires, optional. Can be ISO 8601, seconds
til expiration, unix timestamp, or a datetime object.
:returns: the data provided for quick wrapping of functions or vars.
"""
self.data = data
if not expiration:
try:
expiration = jwt.decode(self.data, options={"verify_signature": False})["exp"]
except jwt.DecodeError:
pass
self.expiration = self.resolve_datetime(expiration) if expiration else None
payload = {"data": self.data, "expiration": self.expiration, "version": self.version}
payload["crc32"] = zlib.crc32(jsonpickle.dumps(payload).encode("utf8"))
self.path.parent.mkdir(parents=True, exist_ok=True)
self.path.write_text(jsonpickle.dumps(payload))
return self.data
def stat(self) -> stat_result:
"""
Get Cache file OS Stat data like Creation Time, Modified Time, and such.
:returns: an os.stat_result tuple
"""
return self.path.stat()
@staticmethod
def resolve_datetime(timestamp: EXP_T) -> datetime:
"""
Resolve multiple formats of a Datetime or Timestamp to an absolute Datetime.
Examples:
            >>> now = datetime.now()
            >>> now
            datetime.datetime(2022, 6, 27, 9, 49, 13, 657208)
>>> iso8601 = now.isoformat()
'2022-06-27T09:49:13.657208'
>>> Cacher.resolve_datetime(iso8601)
datetime.datetime(2022, 6, 27, 9, 49, 13, 657208)
>>> Cacher.resolve_datetime(iso8601 + "Z")
datetime.datetime(2022, 6, 27, 9, 49, 13, 657208)
>>> Cacher.resolve_datetime(3600)
datetime.datetime(2022, 6, 27, 10, 52, 50, 657208)
>>> Cacher.resolve_datetime('3600')
datetime.datetime(2022, 6, 27, 10, 52, 51, 657208)
>>> Cacher.resolve_datetime(7800.113)
datetime.datetime(2022, 6, 27, 11, 59, 13, 770208)
In the int/float examples you may notice that it did not return now + 3600 seconds
but rather something a bit more than that. This is because it did not resolve 3600
seconds from the `now` variable but from right now as the function was called.
"""
if isinstance(timestamp, datetime):
return timestamp
if isinstance(timestamp, str):
if timestamp.endswith("Z"):
# fromisoformat doesn't accept the final Z
timestamp = timestamp.split("Z")[0]
try:
return datetime.fromisoformat(timestamp)
except ValueError:
timestamp = float(timestamp)
try:
if len(str(int(timestamp))) == 13: # JS-style timestamp
timestamp /= 1000
timestamp = datetime.fromtimestamp(timestamp)
except ValueError:
raise ValueError(f"Unrecognized Timestamp value {timestamp!r}")
if timestamp < datetime.now():
# timestamp is likely an amount of seconds til expiration
# or, it's an already expired timestamp which is unlikely
timestamp = timestamp + timedelta(seconds=datetime.now().timestamp())
return timestamp
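A usage sketch, assuming a hypothetical service tag, key name, and payload:

if __name__ == "__main__":
    # Usage sketch (service tag, key name, and payload are hypothetical).
    cache = Cacher("EXAMPLE").get("auth_token")
    if not cache or cache.expired:
        cache.set({"token": "abc123"}, expiration=3600)  # seconds until expiry
    print(cache.data)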

@@ -0,0 +1,4 @@
from .custom_remote_cdm import CustomRemoteCDM
from .decrypt_labs_remote_cdm import DecryptLabsRemoteCDM
__all__ = ["DecryptLabsRemoteCDM", "CustomRemoteCDM"]

File diff suppressed because it is too large

@@ -0,0 +1,749 @@
from __future__ import annotations
import base64
import secrets
from typing import Any, Dict, List, Optional, Union
from uuid import UUID
import requests
from pywidevine.cdm import Cdm as WidevineCdm
from pywidevine.device import DeviceTypes
from requests import Session
from unshackle.core import __version__
from unshackle.core.vaults import Vaults
class MockCertificateChain:
"""Mock certificate chain for PlayReady compatibility."""
def __init__(self, name: str):
self._name = name
def get_name(self) -> str:
return self._name
class Key:
"""Key object compatible with pywidevine."""
def __init__(self, kid: str, key: str, type_: str = "CONTENT"):
if isinstance(kid, str):
clean_kid = kid.replace("-", "")
if len(clean_kid) == 32:
self.kid = UUID(hex=clean_kid)
else:
self.kid = UUID(hex=clean_kid.ljust(32, "0"))
else:
self.kid = kid
if isinstance(key, str):
self.key = bytes.fromhex(key)
else:
self.key = key
self.type = type_
class DecryptLabsRemoteCDMExceptions:
"""Exception classes for compatibility with pywidevine CDM."""
class InvalidSession(Exception):
"""Raised when session ID is invalid."""
class TooManySessions(Exception):
"""Raised when session limit is reached."""
class InvalidInitData(Exception):
"""Raised when PSSH/init data is invalid."""
class InvalidLicenseType(Exception):
"""Raised when license type is invalid."""
class InvalidLicenseMessage(Exception):
"""Raised when license message is invalid."""
class InvalidContext(Exception):
"""Raised when session has no context data."""
class SignatureMismatch(Exception):
"""Raised when signature verification fails."""
class DecryptLabsRemoteCDM:
"""
Decrypt Labs Remote CDM implementation with intelligent caching system.
This class provides a drop-in replacement for pywidevine's local CDM using
Decrypt Labs' KeyXtractor API service, enhanced with smart caching logic
that minimizes unnecessary license requests.
Key Features:
- Compatible with both Widevine and PlayReady DRM schemes
- Intelligent caching that compares required vs. available keys
- Optimized caching for L1/L2 devices (leverages API auto-optimization)
- Automatic key combination for mixed cache/license scenarios
- Seamless fallback to license requests when keys are missing
Intelligent Caching System:
1. DRM classes (PlayReady/Widevine) provide required KIDs via set_required_kids()
2. get_license_challenge() first checks for cached keys
3. For L1/L2 devices, always attempts cached keys first (API optimized)
4. If cached keys satisfy requirements, returns empty challenge (no license needed)
5. If keys are missing, makes targeted license request for remaining keys
6. parse_license() combines cached and license keys intelligently
"""
    service_certificate_challenge = b"\x08\x04"  # standard Widevine service-certificate request message
def __init__(
self,
secret: str,
host: str = "https://keyxtractor.decryptlabs.com",
device_name: str = "ChromeCDM",
service_name: Optional[str] = None,
vaults: Optional[Vaults] = None,
device_type: Optional[str] = None,
system_id: Optional[int] = None,
security_level: Optional[int] = None,
**kwargs,
):
"""
Initialize Decrypt Labs Remote CDM for Widevine and PlayReady schemes.
Args:
secret: Decrypt Labs API key (matches config format)
host: Decrypt Labs API host URL (matches config format)
device_name: DRM scheme (ChromeCDM, L1, L2 for Widevine; SL2, SL3 for PlayReady)
service_name: Service name for key caching and vault operations
vaults: Vaults instance for local key caching
device_type: Device type (CHROME, ANDROID, PLAYREADY) - for compatibility
system_id: System ID - for compatibility
security_level: Security level - for compatibility
"""
_ = kwargs
self.secret = secret
self.host = host.rstrip("/")
self.device_name = device_name
self.service_name = service_name or ""
self.vaults = vaults
        self.uch = self.host != "https://keyxtractor.decryptlabs.com"  # using a custom (non-default) host
self._device_type_str = device_type
if device_type:
self.device_type = self._get_device_type_enum(device_type)
self._is_playready = (device_type and device_type.upper() == "PLAYREADY") or (device_name in ["SL2", "SL3"])
if self._is_playready:
self.system_id = system_id or 0
self.security_level = security_level or (2000 if device_name == "SL2" else 3000)
else:
self.system_id = system_id or 26830
self.security_level = security_level or 3
self._sessions: Dict[bytes, Dict[str, Any]] = {}
self._pssh_b64 = None
self._required_kids: Optional[List[str]] = None
self._http_session = Session()
self._http_session.headers.update(
{
"decrypt-labs-api-key": self.secret,
"Content-Type": "application/json",
"User-Agent": f"unshackle-decrypt-labs-cdm/{__version__}",
}
)
def _get_device_type_enum(self, device_type: str):
"""Convert device type string to enum for compatibility."""
device_type_upper = device_type.upper()
if device_type_upper == "ANDROID":
return DeviceTypes.ANDROID
elif device_type_upper == "CHROME":
return DeviceTypes.CHROME
else:
return DeviceTypes.CHROME
@property
def is_playready(self) -> bool:
"""Check if this CDM is in PlayReady mode."""
return self._is_playready
@property
def certificate_chain(self) -> MockCertificateChain:
"""Mock certificate chain for PlayReady compatibility."""
return MockCertificateChain(f"{self.device_name}_Remote")
def set_pssh_b64(self, pssh_b64: str) -> None:
"""Store base64-encoded PSSH data for PlayReady compatibility."""
self._pssh_b64 = pssh_b64
def set_required_kids(self, kids: List[Union[str, UUID]]) -> None:
"""
Set the required Key IDs for intelligent caching decisions.
This method enables the CDM to make smart decisions about when to request
additional keys via license challenges. When cached keys are available,
the CDM will compare them against the required KIDs to determine if a
license request is still needed for missing keys.
Args:
kids: List of required Key IDs as UUIDs or hex strings
Note:
Should be called by DRM classes (PlayReady/Widevine) before making
license challenge requests to enable optimal caching behavior.
"""
        # UUIDs and hex strings both normalize to lowercase hex without dashes
        self._required_kids = [str(kid).replace("-", "").lower() for kid in kids]
def _generate_session_id(self) -> bytes:
"""Generate a unique session ID."""
return secrets.token_bytes(16)
def _get_init_data_from_pssh(self, pssh: Any) -> str:
"""Extract init data from various PSSH formats."""
if self.is_playready and self._pssh_b64:
return self._pssh_b64
if hasattr(pssh, "dumps"):
dumps_result = pssh.dumps()
if isinstance(dumps_result, str):
try:
base64.b64decode(dumps_result)
return dumps_result
except Exception:
return base64.b64encode(dumps_result.encode("utf-8")).decode("utf-8")
else:
return base64.b64encode(dumps_result).decode("utf-8")
elif hasattr(pssh, "raw"):
raw_data = pssh.raw
if isinstance(raw_data, str):
raw_data = raw_data.encode("utf-8")
return base64.b64encode(raw_data).decode("utf-8")
        elif "WrmHeader" in type(pssh).__name__:
if self.is_playready:
raise ValueError("PlayReady WRM header received but no PSSH B64 was set via set_pssh_b64()")
if hasattr(pssh, "raw_bytes"):
return base64.b64encode(pssh.raw_bytes).decode("utf-8")
elif hasattr(pssh, "bytes"):
return base64.b64encode(pssh.bytes).decode("utf-8")
else:
raise ValueError(f"Cannot extract PSSH data from WRM header type: {type(pssh)}")
else:
raise ValueError(f"Unsupported PSSH type: {type(pssh)}")
def open(self) -> bytes:
"""
Open a new CDM session.
Returns:
Session identifier as bytes
"""
session_id = self._generate_session_id()
self._sessions[session_id] = {
"service_certificate": None,
"keys": [],
"pssh": None,
"challenge": None,
"decrypt_labs_session_id": None,
"tried_cache": False,
"cached_keys": None,
}
return session_id
def close(self, session_id: bytes) -> None:
"""
Close a CDM session and perform comprehensive cleanup.
Args:
session_id: Session identifier
Raises:
ValueError: If session ID is invalid
"""
if session_id not in self._sessions:
raise DecryptLabsRemoteCDMExceptions.InvalidSession(f"Invalid session ID: {session_id.hex()}")
session = self._sessions[session_id]
session.clear()
del self._sessions[session_id]
def get_service_certificate(self, session_id: bytes) -> Optional[bytes]:
"""
Get the service certificate for a session.
Args:
session_id: Session identifier
Returns:
Service certificate if set, None otherwise
Raises:
ValueError: If session ID is invalid
"""
if session_id not in self._sessions:
raise DecryptLabsRemoteCDMExceptions.InvalidSession(f"Invalid session ID: {session_id.hex()}")
return self._sessions[session_id]["service_certificate"]
def set_service_certificate(self, session_id: bytes, certificate: Optional[Union[bytes, str]]) -> str:
"""
Set the service certificate for a session.
Args:
session_id: Session identifier
certificate: Service certificate (bytes or base64 string)
Returns:
Certificate status message
Raises:
ValueError: If session ID is invalid
"""
if session_id not in self._sessions:
raise DecryptLabsRemoteCDMExceptions.InvalidSession(f"Invalid session ID: {session_id.hex()}")
if certificate is None:
if not self._is_playready and self.device_name == "L1":
certificate = WidevineCdm.common_privacy_cert
self._sessions[session_id]["service_certificate"] = base64.b64decode(certificate)
return "Using default Widevine common privacy certificate for L1"
else:
self._sessions[session_id]["service_certificate"] = None
return "No certificate set (not required for this device type)"
if isinstance(certificate, str):
certificate = base64.b64decode(certificate)
self._sessions[session_id]["service_certificate"] = certificate
return "Successfully set Service Certificate"
def has_cached_keys(self, session_id: bytes) -> bool:
"""
Check if cached keys are available for the session.
Args:
session_id: Session identifier
Returns:
True if cached keys are available
Raises:
ValueError: If session ID is invalid
"""
if session_id not in self._sessions:
raise DecryptLabsRemoteCDMExceptions.InvalidSession(f"Invalid session ID: {session_id.hex()}")
session = self._sessions[session_id]
session_keys = session.get("keys", [])
return len(session_keys) > 0
def get_license_challenge(
self, session_id: bytes, pssh_or_wrm: Any, license_type: str = "STREAMING", privacy_mode: bool = True
) -> bytes:
"""
Generate a license challenge using Decrypt Labs API with intelligent caching.
This method implements smart caching logic that:
1. First checks local vaults for required keys
2. Attempts to retrieve cached keys from the API
3. If required KIDs are set, compares available keys (vault + cached) against requirements
4. Only makes a license request if keys are missing
5. Returns empty challenge if all required keys are available
The intelligent caching works as follows:
- Local vaults: Always checked first if available
- For L1/L2 devices: Always prioritizes cached keys (API automatically optimizes)
- For other devices: Uses cache retry logic based on session state
- With required KIDs set: Only requests license for missing keys
- Without required KIDs: Returns any available cached keys
- For PlayReady: Combines vault, cached, and license keys seamlessly
Args:
session_id: Session identifier
pssh_or_wrm: PSSH object or WRM header (for PlayReady compatibility)
license_type: Type of license (STREAMING, OFFLINE, AUTOMATIC) - for compatibility only
privacy_mode: Whether to use privacy mode - for compatibility only
Returns:
License challenge as bytes, or empty bytes if available keys satisfy requirements
Raises:
InvalidSession: If session ID is invalid
requests.RequestException: If API request fails
Note:
Call set_required_kids() before this method for optimal caching behavior.
L1/L2 devices automatically use cached keys when available per API design.
Local vault keys are always checked first when vaults are available.
"""
_ = license_type, privacy_mode
if session_id not in self._sessions:
raise DecryptLabsRemoteCDMExceptions.InvalidSession(f"Invalid session ID: {session_id.hex()}")
session = self._sessions[session_id]
session["pssh"] = pssh_or_wrm
init_data = self._get_init_data_from_pssh(pssh_or_wrm)
already_tried_cache = session.get("tried_cache", False)
if self.vaults and self._required_kids:
vault_keys = []
for kid_str in self._required_kids:
try:
clean_kid = kid_str.replace("-", "")
if len(clean_kid) == 32:
kid_uuid = UUID(hex=clean_kid)
else:
kid_uuid = UUID(hex=clean_kid.ljust(32, "0"))
key, _ = self.vaults.get_key(kid_uuid)
if key and key.count("0") != len(key):
vault_keys.append({"kid": kid_str, "key": key, "type": "CONTENT"})
except (ValueError, TypeError):
continue
if vault_keys:
vault_kids = set(k["kid"] for k in vault_keys)
required_kids = set(self._required_kids)
if required_kids.issubset(vault_kids):
session["keys"] = vault_keys
return b""
else:
session["vault_keys"] = vault_keys
if self.device_name in ["L1", "L2"]:
get_cached_keys = True
else:
get_cached_keys = not already_tried_cache
request_data = {
"scheme": self.device_name,
"init_data": init_data,
"get_cached_keys_if_exists": get_cached_keys,
}
if self.service_name:
request_data["service"] = self.service_name
if session["service_certificate"]:
request_data["service_certificate"] = base64.b64encode(session["service_certificate"]).decode("utf-8")
response = self._http_session.post(f"{self.host}/get-request", json=request_data, timeout=30)
if response.status_code != 200:
raise requests.RequestException(f"API request failed: {response.status_code} {response.text}")
data = response.json()
if data.get("message") != "success":
error_msg = data.get("message", "Unknown error")
if "details" in data:
error_msg += f" - Details: {data['details']}"
if "error" in data:
error_msg += f" - Error: {data['error']}"
if "service_certificate is required" in str(data) and not session["service_certificate"]:
error_msg += " (No service certificate was provided to the CDM session)"
raise requests.RequestException(f"API error: {error_msg}")
message_type = data.get("message_type")
if message_type == "cached-keys" or "cached_keys" in data:
"""
Handle cached keys response from API.
When the API returns cached keys, we need to determine if they satisfy
our requirements or if we need to make an additional license request
for missing keys.
"""
cached_keys = data.get("cached_keys", [])
parsed_keys = self._parse_cached_keys(cached_keys)
all_available_keys = list(parsed_keys)
if "vault_keys" in session:
all_available_keys.extend(session["vault_keys"])
session["tried_cache"] = True
if self._required_kids:
available_kids = set()
for key in all_available_keys:
if isinstance(key, dict) and "kid" in key:
available_kids.add(key["kid"].replace("-", "").lower())
required_kids = set(self._required_kids)
missing_kids = required_kids - available_kids
if missing_kids:
session["cached_keys"] = parsed_keys
if self.device_name in ["L1", "L2"]:
license_request_data = {
"scheme": self.device_name,
"init_data": init_data,
"get_cached_keys_if_exists": False,
}
if self.service_name:
license_request_data["service"] = self.service_name
if session["service_certificate"]:
license_request_data["service_certificate"] = base64.b64encode(
session["service_certificate"]
).decode("utf-8")
else:
license_request_data = request_data.copy()
license_request_data["get_cached_keys_if_exists"] = False
# Make license request for missing keys
response = self._http_session.post(
f"{self.host}/get-request", json=license_request_data, timeout=30
)
if response.status_code == 200:
data = response.json()
if data.get("message") == "success" and "challenge" in data:
challenge = base64.b64decode(data["challenge"])
session["challenge"] = challenge
session["decrypt_labs_session_id"] = data["session_id"]
return challenge
return b""
else:
# All required keys are available from cache
session["keys"] = all_available_keys
return b""
else:
# No required KIDs specified - return cached keys
session["keys"] = all_available_keys
return b""
if message_type == "license-request" or "challenge" in data:
challenge = base64.b64decode(data["challenge"])
session["challenge"] = challenge
session["decrypt_labs_session_id"] = data["session_id"]
return challenge
error_msg = f"Unexpected API response format. message_type={message_type}, available_fields={list(data.keys())}"
if data.get("message"):
error_msg = f"API response: {data['message']} - {error_msg}"
if "details" in data:
error_msg += f" - Details: {data['details']}"
if "error" in data:
error_msg += f" - Error: {data['error']}"
if already_tried_cache and data.get("message") == "success":
return b""
raise requests.RequestException(error_msg)
def parse_license(self, session_id: bytes, license_message: Union[bytes, str]) -> None:
"""
Parse license response using Decrypt Labs API with intelligent key combination.
For PlayReady content with partial cached keys, this method intelligently
combines the cached keys with newly obtained license keys, avoiding
duplicates while ensuring all required keys are available.
The key combination process:
1. Extracts keys from the license response
2. If cached keys exist (PlayReady), combines them with license keys
3. Removes duplicate keys by comparing normalized KIDs
4. Updates the session with the complete key set
Args:
session_id: Session identifier
license_message: License response from license server
Raises:
ValueError: If session ID is invalid or no challenge available
requests.RequestException: If API request fails
"""
if session_id not in self._sessions:
raise DecryptLabsRemoteCDMExceptions.InvalidSession(f"Invalid session ID: {session_id.hex()}")
session = self._sessions[session_id]
# Skip parsing if we already have final keys (no cached keys to combine)
# If cached_keys exist (Widevine or PlayReady), we need to combine them with license keys
if session["keys"] and "cached_keys" not in session:
return
if not session.get("challenge") or not session.get("decrypt_labs_session_id"):
raise ValueError("No challenge available - call get_license_challenge first")
if isinstance(license_message, str):
if self.is_playready and license_message.strip().startswith("<?xml"):
license_message = license_message.encode("utf-8")
else:
try:
license_message = base64.b64decode(license_message)
except Exception:
license_message = license_message.encode("utf-8")
pssh = session["pssh"]
init_data = self._get_init_data_from_pssh(pssh)
license_request_b64 = base64.b64encode(session["challenge"]).decode("utf-8")
license_response_b64 = base64.b64encode(license_message).decode("utf-8")
request_data = {
"scheme": self.device_name,
"session_id": session["decrypt_labs_session_id"],
"init_data": init_data,
"license_request": license_request_b64,
"license_response": license_response_b64,
}
response = self._http_session.post(f"{self.host}/decrypt-response", json=request_data, timeout=30)
if response.status_code != 200:
raise requests.RequestException(f"License decrypt failed: {response.status_code} {response.text}")
data = response.json()
if data.get("message") != "success":
error_msg = data.get("message", "Unknown error")
if "error" in data:
error_msg += f" - Error: {data['error']}"
if "details" in data:
error_msg += f" - Details: {data['details']}"
raise requests.RequestException(f"License decrypt error: {error_msg}")
license_keys = self._parse_keys_response(data)
all_keys = []
if "vault_keys" in session:
all_keys.extend(session["vault_keys"])
if "cached_keys" in session:
cached_keys = session.get("cached_keys", [])
if cached_keys:
for cached_key in cached_keys:
all_keys.append(cached_key)
for license_key in license_keys:
already_exists = False
license_kid = None
if isinstance(license_key, dict) and "kid" in license_key:
license_kid = license_key["kid"].replace("-", "").lower()
elif hasattr(license_key, "kid"):
license_kid = str(license_key.kid).replace("-", "").lower()
elif hasattr(license_key, "key_id"):
license_kid = str(license_key.key_id).replace("-", "").lower()
if license_kid:
for existing_key in all_keys:
existing_kid = None
if isinstance(existing_key, dict) and "kid" in existing_key:
existing_kid = existing_key["kid"].replace("-", "").lower()
elif hasattr(existing_key, "kid"):
existing_kid = str(existing_key.kid).replace("-", "").lower()
elif hasattr(existing_key, "key_id"):
existing_kid = str(existing_key.key_id).replace("-", "").lower()
if existing_kid == license_kid:
already_exists = True
break
if not already_exists:
all_keys.append(license_key)
session["keys"] = all_keys
session.pop("cached_keys", None)
session.pop("vault_keys", None)
if self.vaults and session["keys"]:
key_dict = {}
for key in session["keys"]:
if key["type"] == "CONTENT":
try:
clean_kid = key["kid"].replace("-", "")
if len(clean_kid) == 32:
kid_uuid = UUID(hex=clean_kid)
else:
kid_uuid = UUID(hex=clean_kid.ljust(32, "0"))
key_dict[kid_uuid] = key["key"]
except (ValueError, TypeError):
continue
if key_dict:
self.vaults.add_keys(key_dict)
def get_keys(self, session_id: bytes, type_: Optional[str] = None) -> List[Key]:
"""
Get keys from the session.
Args:
session_id: Session identifier
type_: Optional key type filter (CONTENT, SIGNING, etc.)
Returns:
List of Key objects
Raises:
InvalidSession: If session ID is invalid
"""
if session_id not in self._sessions:
raise DecryptLabsRemoteCDMExceptions.InvalidSession(f"Invalid session ID: {session_id.hex()}")
key_dicts = self._sessions[session_id]["keys"]
keys = [Key(kid=k["kid"], key=k["key"], type_=k["type"]) for k in key_dicts]
if type_:
keys = [key for key in keys if key.type == type_]
return keys
def _parse_cached_keys(self, cached_keys_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Parse cached keys from API response.
Args:
cached_keys_data: List of cached key objects from API
Returns:
List of key dictionaries
"""
keys = []
try:
if cached_keys_data and isinstance(cached_keys_data, list):
for key_data in cached_keys_data:
if "kid" in key_data and "key" in key_data:
keys.append({"kid": key_data["kid"], "key": key_data["key"], "type": "CONTENT"})
except Exception:
pass
return keys
def _parse_keys_response(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Parse keys from decrypt response."""
keys = []
if "keys" in data and isinstance(data["keys"], str):
keys_string = data["keys"]
for line in keys_string.split("\n"):
line = line.strip()
if line.startswith("--key "):
key_part = line[6:]
if ":" in key_part:
kid, key = key_part.split(":", 1)
keys.append({"kid": kid.strip(), "key": key.strip(), "type": "CONTENT"})
elif "keys" in data and isinstance(data["keys"], list):
for key_data in data["keys"]:
keys.append(
{"kid": key_data.get("kid"), "key": key_data.get("key"), "type": key_data.get("type", "CONTENT")}
)
return keys
__all__ = ["DecryptLabsRemoteCDM"]

@@ -0,0 +1,35 @@
from typing import Optional
import click
from unshackle.core.config import config
from unshackle.core.utilities import import_module_by_path
_COMMANDS = sorted(
(path for path in config.directories.commands.glob("*.py") if path.stem.lower() != "__init__"), key=lambda x: x.stem
)
_MODULES = {path.stem: getattr(import_module_by_path(path), path.stem) for path in _COMMANDS}
class Commands(click.MultiCommand):
"""Lazy-loaded command group of project commands."""
def list_commands(self, ctx: click.Context) -> list[str]:
"""Returns a list of command names from the command filenames."""
return [x.stem for x in _COMMANDS]
def get_command(self, ctx: click.Context, name: str) -> Optional[click.Command]:
"""Load the command code and return the main click command function."""
module = _MODULES.get(name)
if not module:
raise click.ClickException(f"Unable to find command by the name '{name}'")
if hasattr(module, "cli"):
return module.cli
return module
# Hide direct access to commands from the quick-import form; they shouldn't be accessed directly
__all__ = ("Commands",)

unshackle/core/config.py Normal file

@@ -0,0 +1,144 @@
from __future__ import annotations
from pathlib import Path
from typing import Any, Optional
import yaml
from appdirs import AppDirs
class Config:
class _Directories:
# default directories, do not modify here, set via config
app_dirs = AppDirs("unshackle", False)
core_dir = Path(__file__).resolve().parent
namespace_dir = core_dir.parent
commands = namespace_dir / "commands"
services = [namespace_dir / "services"]
vaults = namespace_dir / "vaults"
fonts = namespace_dir / "fonts"
user_configs = core_dir.parent
data = core_dir.parent
downloads = core_dir.parent.parent / "downloads"
temp = core_dir.parent.parent / "temp"
cache = data / "cache"
cookies = data / "cookies"
logs = data / "logs"
wvds = data / "WVDs"
prds = data / "PRDs"
dcsl = data / "DCSL"
class _Filenames:
# default filenames, do not modify here, set via config
log = "unshackle_{name}_{time}.log" # Directories.logs
debug_log = "unshackle_debug_{service}_{time}.jsonl" # Directories.logs
config = "config.yaml" # Directories.services / tag
root_config = "unshackle.yaml" # Directories.user_configs
chapters = "Chapters_{title}_{random}.txt" # Directories.temp
subtitle = "Subtitle_{id}_{language}.srt" # Directories.temp
def __init__(self, **kwargs: Any):
self.dl: dict = kwargs.get("dl") or {}
self.aria2c: dict = kwargs.get("aria2c") or {}
self.n_m3u8dl_re: dict = kwargs.get("n_m3u8dl_re") or {}
self.cdm: dict = kwargs.get("cdm") or {}
self.chapter_fallback_name: str = kwargs.get("chapter_fallback_name") or ""
self.curl_impersonate: dict = kwargs.get("curl_impersonate") or {}
self.remote_cdm: list[dict] = kwargs.get("remote_cdm") or []
self.credentials: dict = kwargs.get("credentials") or {}
self.subtitle: dict = kwargs.get("subtitle") or {}
self.directories = self._Directories()
for name, path in (kwargs.get("directories") or {}).items():
if name.lower() in ("app_dirs", "core_dir", "namespace_dir", "user_configs", "data"):
# these must not be modified by the user
continue
if name == "services" and isinstance(path, list):
setattr(self.directories, name, [Path(p).expanduser() for p in path])
else:
setattr(self.directories, name, Path(path).expanduser())
downloader_cfg = kwargs.get("downloader") or "requests"
if isinstance(downloader_cfg, dict):
self.downloader_map = {k.upper(): v for k, v in downloader_cfg.items()}
self.downloader = self.downloader_map.get("DEFAULT", "requests")
else:
self.downloader_map = {}
self.downloader = downloader_cfg
self.filenames = self._Filenames()
for name, filename in (kwargs.get("filenames") or {}).items():
setattr(self.filenames, name, filename)
self.headers: dict = kwargs.get("headers") or {}
self.key_vaults: list[dict[str, Any]] = kwargs.get("key_vaults", [])
self.muxing: dict = kwargs.get("muxing") or {}
self.proxy_providers: dict = kwargs.get("proxy_providers") or {}
self.serve: dict = kwargs.get("serve") or {}
self.services: dict = kwargs.get("services") or {}
decryption_cfg = kwargs.get("decryption") or {}
if isinstance(decryption_cfg, dict):
self.decryption_map = {k.upper(): v for k, v in decryption_cfg.items()}
self.decryption = self.decryption_map.get("DEFAULT", "shaka")
else:
self.decryption_map = {}
self.decryption = decryption_cfg or "shaka"
self.set_terminal_bg: bool = kwargs.get("set_terminal_bg", False)
self.tag: str = kwargs.get("tag") or ""
self.tag_group_name: bool = kwargs.get("tag_group_name", True)
self.tag_imdb_tmdb: bool = kwargs.get("tag_imdb_tmdb", True)
self.tmdb_api_key: str = kwargs.get("tmdb_api_key") or ""
self.simkl_client_id: str = kwargs.get("simkl_client_id") or ""
self.decrypt_labs_api_key: str = kwargs.get("decrypt_labs_api_key") or ""
self.update_checks: bool = kwargs.get("update_checks", True)
self.update_check_interval: int = kwargs.get("update_check_interval", 24)
self.scene_naming: bool = kwargs.get("scene_naming", True)
self.series_year: bool = kwargs.get("series_year", True)
self.title_cache_time: int = kwargs.get("title_cache_time", 1800) # 30 minutes default
self.title_cache_max_retention: int = kwargs.get("title_cache_max_retention", 86400) # 24 hours default
self.title_cache_enabled: bool = kwargs.get("title_cache_enabled", True)
self.debug: bool = kwargs.get("debug", False)
self.debug_keys: bool = kwargs.get("debug_keys", False)
@classmethod
def from_yaml(cls, path: Path) -> Config:
if not path.exists():
raise FileNotFoundError(f"Config file path ({path}) was not found")
if not path.is_file():
raise FileNotFoundError(f"Config file path ({path}) is not to a file.")
return cls(**yaml.safe_load(path.read_text(encoding="utf8")) or {})
# noinspection PyProtectedMember
POSSIBLE_CONFIG_PATHS = (
# The unshackle Namespace Folder (e.g., %appdata%/Python/Python311/site-packages/unshackle)
Config._Directories.namespace_dir / Config._Filenames.root_config,
# The Parent Folder to the unshackle Namespace Folder (e.g., %appdata%/Python/Python311/site-packages)
Config._Directories.namespace_dir.parent / Config._Filenames.root_config,
# The AppDirs User Config Folder (e.g., ~/.config/unshackle on Linux, %LOCALAPPDATA%\unshackle on Windows)
Path(Config._Directories.app_dirs.user_config_dir) / Config._Filenames.root_config,
)
def get_config_path() -> Optional[Path]:
"""
Get Path to Config from any one of the possible locations.
Returns None if no config file could be found.
"""
for path in POSSIBLE_CONFIG_PATHS:
if path.exists():
return path
return None
config_path = get_config_path()
if config_path:
config = Config.from_yaml(config_path)
else:
config = Config()
__all__ = ("config",)

unshackle/core/console.py Normal file

@@ -0,0 +1,351 @@
import logging
from datetime import datetime
from types import ModuleType
from typing import IO, Callable, Iterable, List, Literal, Mapping, Optional, Union
from rich._log_render import FormatTimeCallable, LogRender
from rich.console import Console, ConsoleRenderable, HighlighterType, RenderableType
from rich.emoji import EmojiVariant
from rich.highlighter import Highlighter, ReprHighlighter
from rich.live import Live
from rich.logging import RichHandler
from rich.padding import Padding, PaddingDimensions
from rich.status import Status
from rich.style import StyleType
from rich.table import Table
from rich.text import Text, TextType
from rich.theme import Theme
from unshackle.core.config import config
class ComfyLogRenderer(LogRender):
def __call__(
self,
console: "Console",
renderables: Iterable["ConsoleRenderable"],
log_time: Optional[datetime] = None,
time_format: Optional[Union[str, FormatTimeCallable]] = None,
level: TextType = "",
path: Optional[str] = None,
line_no: Optional[int] = None,
link_path: Optional[str] = None,
) -> "Table":
from rich.containers import Renderables
output = Table.grid(padding=(0, 5), pad_edge=True)
output.expand = True
if self.show_time:
output.add_column(style="log.time")
if self.show_level:
output.add_column(style="log.level", width=self.level_width)
output.add_column(ratio=1, style="log.message", overflow="fold")
if self.show_path and path:
output.add_column(style="log.path")
row: List["RenderableType"] = []
if self.show_time:
log_time = log_time or console.get_datetime()
time_format = time_format or self.time_format
if callable(time_format):
log_time_display = time_format(log_time)
else:
log_time_display = Text(log_time.strftime(time_format))
if log_time_display == self._last_time and self.omit_repeated_times:
row.append(Text(" " * len(log_time_display)))
else:
row.append(log_time_display)
self._last_time = log_time_display
if self.show_level:
row.append(level)
row.append(Renderables(renderables))
if self.show_path and path:
path_text = Text()
path_text.append(path, style=f"link file://{link_path}" if link_path else "")
if line_no:
path_text.append(":")
path_text.append(
f"{line_no}",
style=f"link file://{link_path}#{line_no}" if link_path else "",
)
row.append(path_text)
output.add_row(*row)
return output
class ComfyRichHandler(RichHandler):
def __init__(
self,
level: Union[int, str] = logging.NOTSET,
console: Optional[Console] = None,
*,
show_time: bool = True,
omit_repeated_times: bool = True,
show_level: bool = True,
show_path: bool = True,
enable_link_path: bool = True,
highlighter: Optional[Highlighter] = None,
markup: bool = False,
rich_tracebacks: bool = False,
tracebacks_width: Optional[int] = None,
tracebacks_extra_lines: int = 3,
tracebacks_theme: Optional[str] = None,
tracebacks_word_wrap: bool = True,
tracebacks_show_locals: bool = False,
tracebacks_suppress: Iterable[Union[str, ModuleType]] = (),
locals_max_length: int = 10,
locals_max_string: int = 80,
log_time_format: Union[str, FormatTimeCallable] = "[%x %X]",
keywords: Optional[List[str]] = None,
log_renderer: Optional[LogRender] = None,
) -> None:
super().__init__(
level=level,
console=console,
show_time=show_time,
omit_repeated_times=omit_repeated_times,
show_level=show_level,
show_path=show_path,
enable_link_path=enable_link_path,
highlighter=highlighter,
markup=markup,
rich_tracebacks=rich_tracebacks,
tracebacks_width=tracebacks_width,
tracebacks_extra_lines=tracebacks_extra_lines,
tracebacks_theme=tracebacks_theme,
tracebacks_word_wrap=tracebacks_word_wrap,
tracebacks_show_locals=tracebacks_show_locals,
tracebacks_suppress=tracebacks_suppress,
locals_max_length=locals_max_length,
locals_max_string=locals_max_string,
log_time_format=log_time_format,
keywords=keywords,
)
if log_renderer:
self._log_render = log_renderer
class ComfyConsole(Console):
"""A comfy high level console interface.
Args:
color_system (str, optional): The color system supported by your terminal,
either ``"standard"``, ``"256"`` or ``"truecolor"``. Leave as ``"auto"`` to autodetect.
force_terminal (Optional[bool], optional): Enable/disable terminal control codes, or None to auto-detect
terminal. Defaults to None.
force_jupyter (Optional[bool], optional): Enable/disable Jupyter rendering, or None to auto-detect Jupyter.
Defaults to None.
force_interactive (Optional[bool], optional): Enable/disable interactive mode, or None to auto-detect.
Defaults to None.
soft_wrap (Optional[bool], optional): Set soft wrap default on print method. Defaults to False.
theme (Theme, optional): An optional style theme object, or ``None`` for default theme.
stderr (bool, optional): Use stderr rather than stdout if ``file`` is not specified. Defaults to False.
file (IO, optional): A file object where the console should write to. Defaults to stdout.
quiet (bool, Optional): Boolean to suppress all output. Defaults to False.
width (int, optional): The width of the terminal. Leave as default to auto-detect width.
height (int, optional): The height of the terminal. Leave as default to auto-detect height.
style (StyleType, optional): Style to apply to all output, or None for no style. Defaults to None.
no_color (Optional[bool], optional): Enabled no color mode, or None to auto-detect. Defaults to None.
tab_size (int, optional): Number of spaces used to replace a tab character. Defaults to 8.
record (bool, optional): Boolean to enable recording of terminal output,
required to call :meth:`export_html`, :meth:`export_svg`, and :meth:`export_text`. Defaults to False.
markup (bool, optional): Boolean to enable :ref:`console_markup`. Defaults to True.
emoji (bool, optional): Enable emoji code. Defaults to True.
emoji_variant (str, optional): Optional emoji variant, either "text" or "emoji". Defaults to None.
highlight (bool, optional): Enable automatic highlighting. Defaults to True.
log_time (bool, optional): Boolean to enable logging of time by :meth:`log` methods. Defaults to True.
log_path (bool, optional): Boolean to enable the logging of the caller by :meth:`log`. Defaults to True.
log_time_format (Union[str, TimeFormatterCallable], optional): If ``log_time`` is enabled, either string for
strftime or callable that formats the time. Defaults to "[%X] ".
highlighter (HighlighterType, optional): Default highlighter.
legacy_windows (bool, optional): Enable legacy Windows mode, or ``None`` to auto-detect. Defaults to ``None``.
safe_box (bool, optional): Restrict box options that don't render on legacy Windows.
get_datetime (Callable[[], datetime], optional): Callable that gets the current time as a datetime.datetime
object (used by Console.log), or None for datetime.now.
get_time (Callable[[], time], optional): Callable that gets the current time in seconds, default uses
time.monotonic.
"""
def __init__(
self,
*,
color_system: Optional[Literal["auto", "standard", "256", "truecolor", "windows"]] = "auto",
force_terminal: Optional[bool] = None,
force_jupyter: Optional[bool] = None,
force_interactive: Optional[bool] = None,
soft_wrap: bool = False,
theme: Optional[Theme] = None,
stderr: bool = False,
file: Optional[IO[str]] = None,
quiet: bool = False,
width: Optional[int] = None,
height: Optional[int] = None,
style: Optional[StyleType] = None,
no_color: Optional[bool] = None,
tab_size: int = 8,
record: bool = False,
markup: bool = True,
emoji: bool = True,
emoji_variant: Optional[EmojiVariant] = None,
highlight: bool = True,
log_time: bool = True,
log_path: bool = True,
log_time_format: Union[str, FormatTimeCallable] = "[%X]",
highlighter: Optional["HighlighterType"] = ReprHighlighter(),
legacy_windows: Optional[bool] = None,
safe_box: bool = True,
get_datetime: Optional[Callable[[], datetime]] = None,
get_time: Optional[Callable[[], float]] = None,
_environ: Optional[Mapping[str, str]] = None,
log_renderer: Optional[LogRender] = None,
):
super().__init__(
color_system=color_system,
force_terminal=force_terminal,
force_jupyter=force_jupyter,
force_interactive=force_interactive,
soft_wrap=soft_wrap,
theme=theme,
stderr=stderr,
file=file,
quiet=quiet,
width=width,
height=height,
style=style,
no_color=no_color,
tab_size=tab_size,
record=record,
markup=markup,
emoji=emoji,
emoji_variant=emoji_variant,
highlight=highlight,
log_time=log_time,
log_path=log_path,
log_time_format=log_time_format,
highlighter=highlighter,
legacy_windows=legacy_windows,
safe_box=safe_box,
get_datetime=get_datetime,
get_time=get_time,
_environ=_environ,
)
if log_renderer:
self._log_render = log_renderer
def status(
self,
status: RenderableType,
*,
spinner: str = "dots",
spinner_style: str = "status.spinner",
speed: float = 1.0,
refresh_per_second: float = 12.5,
pad: PaddingDimensions = (0, 5),
) -> Union[Live, Status]:
"""Display a comfy status and spinner.
Args:
status (RenderableType): A status renderable (str or Text typically).
spinner (str, optional): Name of spinner animation (see python -m rich.spinner). Defaults to "dots".
spinner_style (StyleType, optional): Style of spinner. Defaults to "status.spinner".
speed (float, optional): Speed factor for spinner animation. Defaults to 1.0.
refresh_per_second (float, optional): Number of refreshes per second. Defaults to 12.5.
pad (Union[int, Tuple[int]]): Padding for top, right, bottom, and left borders.
May be specified with 1, 2, or 4 integers (CSS style).
Returns:
Status: A Status object that may be used as a context manager.
"""
status_renderable = super().status(
status=status,
spinner=spinner,
spinner_style=spinner_style,
speed=speed,
refresh_per_second=refresh_per_second,
)
if pad:
top, right, bottom, left = Padding.unpack(pad)
renderable_width = len(status_renderable.status)
spinner_width = len(status_renderable.renderable.text)
status_width = spinner_width + renderable_width
available_width = self.width - status_width
if available_width > right:
# fill up the available width with padding to apply bg color
right = available_width - right
padding = Padding(status_renderable, (top, right, bottom, left))
return Live(padding, console=self, transient=True)
return status_renderable
catppuccin_mocha = {
# Colors based on "CatppuccinMocha" from Gogh themes
"bg": "rgb(30,30,46)",
"text": "rgb(205,214,244)",
"text2": "rgb(162,169,193)", # slightly darker
"black": "rgb(69,71,90)",
"bright_black": "rgb(88,91,112)",
"red": "rgb(243,139,168)",
"green": "rgb(166,227,161)",
"yellow": "rgb(249,226,175)",
"blue": "rgb(137,180,250)",
"pink": "rgb(245,194,231)",
"cyan": "rgb(148,226,213)",
"gray": "rgb(166,173,200)",
"bright_gray": "rgb(186,194,222)",
"dark_gray": "rgb(54,54,84)",
}
primary_scheme = catppuccin_mocha
primary_scheme["none"] = primary_scheme["text"]
primary_scheme["grey23"] = primary_scheme["black"]
primary_scheme["magenta"] = primary_scheme["pink"]
primary_scheme["bright_red"] = primary_scheme["red"]
primary_scheme["bright_green"] = primary_scheme["green"]
primary_scheme["bright_yellow"] = primary_scheme["yellow"]
primary_scheme["bright_blue"] = primary_scheme["blue"]
primary_scheme["bright_magenta"] = primary_scheme["pink"]
primary_scheme["bright_cyan"] = primary_scheme["cyan"]
if config.set_terminal_bg:
primary_scheme["none"] += f" on {primary_scheme['bg']}"
custom_colors = {"ascii.art": primary_scheme["pink"]}
if config.set_terminal_bg:
custom_colors["ascii.art"] += f" on {primary_scheme['bg']}"
console = ComfyConsole(
log_time=False,
log_path=False,
width=80,
theme=Theme(
{
"bar.back": primary_scheme["dark_gray"],
"bar.complete": primary_scheme["pink"],
"bar.finished": primary_scheme["green"],
"bar.pulse": primary_scheme["bright_black"],
"black": primary_scheme["black"],
"inspect.async_def": f"italic {primary_scheme['cyan']}",
"progress.data.speed": "dark_orange",
"repr.number": f"bold not italic {primary_scheme['cyan']}",
"repr.number_complex": f"bold not italic {primary_scheme['cyan']}",
"rule.line": primary_scheme["dark_gray"],
"rule.text": primary_scheme["pink"],
"tree.line": primary_scheme["dark_gray"],
"status.spinner": primary_scheme["pink"],
"progress.spinner": primary_scheme["pink"],
**primary_scheme,
**custom_colors,
}
),
log_renderer=ComfyLogRenderer(show_time=False, show_path=False),
)
__all__ = ("ComfyLogRenderer", "ComfyRichHandler", "ComfyConsole", "console")

unshackle/core/constants.py Normal file

@@ -0,0 +1,32 @@
from threading import Event
from typing import TypeVar, Union
DOWNLOAD_CANCELLED = Event()
DOWNLOAD_LICENCE_ONLY = Event()
DRM_SORT_MAP = ["ClearKey", "Widevine"]
LANGUAGE_MAX_DISTANCE = 5 # this is max to be considered "same", e.g., en, en-US, en-AU
LANGUAGE_EXACT_DISTANCE = 0 # exact match only, no variants
VIDEO_CODEC_MAP = {"AVC": "H.264", "HEVC": "H.265"}
DYNAMIC_RANGE_MAP = {
"HDR10": "HDR",
"HDR10+": "HDR10P",
"Dolby Vision": "DV",
"HDR10 / HDR10+": "HDR10P",
"HDR10 / HDR10": "HDR",
}
AUDIO_CODEC_MAP = {"E-AC-3": "DDP", "AC-3": "DD"}
context_settings = dict(
help_option_names=["-?", "-h", "--help"], # default only has --help
max_content_width=116, # max PEP8 line-width, -4 to adjust for initial indent
)
# For use in signatures of functions which take one specific type of track at a time
# (it can't be a list that contains e.g. both Video and Audio objects)
TrackT = TypeVar("TrackT", bound="Track") # noqa: F821
# For general use in lists that can contain mixed types of tracks.
# list[Track] won't work because list is invariant.
# TODO: Add Chapter?
AnyTrack = Union["Video", "Audio", "Subtitle"] # noqa: F821
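# For example (illustrative signatures, not part of this module): TrackT ties a
# function's input and output to one concrete track type, while AnyTrack admits
# lists holding a mix of track types:
#   def apply_delay(track: TrackT, seconds: float) -> TrackT: ...
#   def total_size(tracks: list[AnyTrack]) -> int: ...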

View File

@ -0,0 +1,87 @@
from __future__ import annotations
import base64
import hashlib
import re
from pathlib import Path
from typing import Optional, Union
class Credential:
"""Username (or Email) and Password Credential."""
def __init__(self, username: str, password: str, extra: Optional[str] = None):
self.username = username
self.password = password
self.extra = extra
self.sha1 = hashlib.sha1(self.dumps().encode()).hexdigest()
def __bool__(self) -> bool:
return bool(self.username) and bool(self.password)
def __str__(self) -> str:
return self.dumps()
def __repr__(self) -> str:
return "{name}({items})".format(
name=self.__class__.__name__, items=", ".join([f"{k}={repr(v)}" for k, v in self.__dict__.items()])
)
def dumps(self) -> str:
"""Return credential data as a string."""
return f"{self.username}:{self.password}" + (f":{self.extra}" if self.extra else "")
def dump(self, path: Union[Path, str]) -> int:
"""Write credential data to a file."""
if isinstance(path, str):
path = Path(path)
return path.write_text(self.dumps(), encoding="utf8")
def as_base64(self, with_extra: bool = False, encode_password: bool = False, encode_extra: bool = False) -> str:
"""
Dump Credential as a Base64-encoded string in Basic Authorization style.
encode_password and encode_extra will also Base64-encode the password and extra respectively.
"""
value = f"{self.username}:"
if encode_password:
value += base64.b64encode(self.password.encode()).decode()
else:
value += self.password
if with_extra and self.extra:
if encode_extra:
value += f":{base64.b64encode(self.extra.encode()).decode()}"
else:
value += f":{self.extra}"
return base64.b64encode(value.encode()).decode()
@classmethod
def loads(cls, text: str) -> Credential:
"""
Load credential from a text string.
Format: {username}:{password}
Rules:
The text must contain exactly one credential.
All whitespace before and after the text will be removed.
Any whitespace between text will be kept and used.
The credential can span one or multiple lines as long as it
abides by all the above rules and the format.
Example that follows the format and rules:
`\tJohnd\noe@gm\n\rail.com\n:Pass1\n23\n\r \t \t`
loads to: Credential(username='Johndoe@gmail.com', password='Pass123')
"""
text = "".join([x.strip() for x in text.splitlines(keepends=False)]).strip()
credential = re.fullmatch(r"^([^:]+?):([^:]+?)(?::(.+))?$", text)
if credential:
return cls(*credential.groups())
raise ValueError("No credentials found in text string. Expecting the format `username:password`")
@classmethod
def load(cls, path: Path) -> Credential:
"""
Load Credential from a file path.
Use Credential.loads() to load from text content; see it for the rules and
format expected of the file's contents.
"""
return cls.loads(path.read_text("utf8"))
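# Minimal usage sketch (hypothetical values): round-trip a credential through
# loads()/dumps() and emit a Basic-Authorization style value.
if __name__ == "__main__":
    cred = Credential.loads("johndoe@gmail.com:Pass123:extra-token")
    assert cred.dumps() == "johndoe@gmail.com:Pass123:extra-token"
    print(cred.as_base64(with_extra=True))  # base64 of the full username:password:extra string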

View File

@ -0,0 +1,6 @@
from .aria2c import aria2c
from .curl_impersonate import curl_impersonate
from .n_m3u8dl_re import n_m3u8dl_re
from .requests import requests
__all__ = ("aria2c", "curl_impersonate", "requests", "n_m3u8dl_re")

View File

@ -0,0 +1,331 @@
import os
import subprocess
import textwrap
import time
from functools import partial
from http.cookiejar import CookieJar
from pathlib import Path
from typing import Any, Callable, Generator, MutableMapping, Optional, Union
from urllib.parse import urlparse
import requests
from Crypto.Random import get_random_bytes
from requests import Session
from requests.cookies import cookiejar_from_dict, get_cookie_header
from rich import filesize
from rich.text import Text
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import DOWNLOAD_CANCELLED
from unshackle.core.utilities import get_extension, get_free_port
def rpc(caller: Callable, secret: str, method: str, params: Optional[list[Any]] = None) -> Any:
"""Make a call to Aria2's JSON-RPC API."""
try:
rpc_res = caller(
json={
"jsonrpc": "2.0",
"id": get_random_bytes(16).hex(),
"method": method,
"params": [f"token:{secret}", *(params or [])],
}
).json()
if rpc_res.get("code"):
# wrap to console width - padding - '[Aria2c]: '
error_pretty = "\n ".join(
textwrap.wrap(
f"RPC Error: {rpc_res['message']} ({rpc_res['code']})".strip(),
width=console.width - 20,
initial_indent="",
)
)
console.log(Text.from_ansi("\n[Aria2c]: " + error_pretty))
return rpc_res["result"]
except requests.exceptions.ConnectionError:
# absorb, process likely ended as it was calling RPC
return
def download(
urls: Union[str, list[str], dict[str, Any], list[dict[str, Any]]],
output_dir: Path,
filename: str,
headers: Optional[MutableMapping[str, Union[str, bytes]]] = None,
cookies: Optional[Union[MutableMapping[str, str], CookieJar]] = None,
proxy: Optional[str] = None,
max_workers: Optional[int] = None,
) -> Generator[dict[str, Any], None, None]:
if not urls:
raise ValueError("urls must be provided and not empty")
elif not isinstance(urls, (str, dict, list)):
raise TypeError(f"Expected urls to be {str} or {dict} or a list of one of them, not {type(urls)}")
if not output_dir:
raise ValueError("output_dir must be provided")
elif not isinstance(output_dir, Path):
raise TypeError(f"Expected output_dir to be {Path}, not {type(output_dir)}")
if not filename:
raise ValueError("filename must be provided")
elif not isinstance(filename, str):
raise TypeError(f"Expected filename to be {str}, not {type(filename)}")
if not isinstance(headers, (MutableMapping, type(None))):
raise TypeError(f"Expected headers to be {MutableMapping}, not {type(headers)}")
if not isinstance(cookies, (MutableMapping, CookieJar, type(None))):
raise TypeError(f"Expected cookies to be {MutableMapping} or {CookieJar}, not {type(cookies)}")
if not isinstance(proxy, (str, type(None))):
raise TypeError(f"Expected proxy to be {str}, not {type(proxy)}")
if not max_workers:
max_workers = min(32, (os.cpu_count() or 1) + 4)
elif not isinstance(max_workers, int):
raise TypeError(f"Expected max_workers to be {int}, not {type(max_workers)}")
if not isinstance(urls, list):
urls = [urls]
if not binaries.Aria2:
raise EnvironmentError("Aria2c executable not found...")
if proxy and not proxy.lower().startswith("http://"):
raise ValueError("Only HTTP proxies are supported by aria2(c)")
if cookies and not isinstance(cookies, CookieJar):
cookies = cookiejar_from_dict(cookies)
url_files = []
for i, url in enumerate(urls):
if isinstance(url, str):
url_data = {"url": url}
else:
url_data: dict[str, Any] = url
url_filename = filename.format(i=i, ext=get_extension(url_data["url"]))
url_text = url_data["url"]
url_text += f"\n\tdir={output_dir}"
url_text += f"\n\tout={url_filename}"
if cookies:
mock_request = requests.Request(url=url_data["url"])
cookie_header = get_cookie_header(cookies, mock_request)
if cookie_header:
url_text += f"\n\theader=Cookie: {cookie_header}"
for key, value in url_data.items():
if key == "url":
continue
if key == "headers":
for header_name, header_value in value.items():
url_text += f"\n\theader={header_name}: {header_value}"
else:
url_text += f"\n\t{key}={value}"
url_files.append(url_text)
url_file = "\n".join(url_files)
rpc_port = get_free_port()
rpc_secret = get_random_bytes(16).hex()
rpc_uri = f"http://127.0.0.1:{rpc_port}/jsonrpc"
rpc_session = Session()
max_concurrent_downloads = int(config.aria2c.get("max_concurrent_downloads", max_workers))
max_connection_per_server = int(config.aria2c.get("max_connection_per_server", 1))
split = int(config.aria2c.get("split", 5))
file_allocation = config.aria2c.get("file_allocation", "prealloc")
if len(urls) > 1:
split = 1
file_allocation = "none"
arguments = [
# [Basic Options]
"--input-file",
"-",
"--all-proxy",
proxy or "",
"--continue=true",
# [Connection Options]
f"--max-concurrent-downloads={max_concurrent_downloads}",
f"--max-connection-per-server={max_connection_per_server}",
f"--split={split}", # each split uses their own connection
"--max-file-not-found=5", # counted towards --max-tries
"--max-tries=5",
"--retry-wait=2",
# [Advanced Options]
"--allow-overwrite=true",
"--auto-file-renaming=false",
"--console-log-level=warn",
"--download-result=default",
f"--file-allocation={file_allocation}",
"--summary-interval=0",
# [RPC Options]
"--enable-rpc=true",
f"--rpc-listen-port={rpc_port}",
f"--rpc-secret={rpc_secret}",
]
for header, value in (headers or {}).items():
if header.lower() == "cookie":
raise ValueError("You cannot set Cookies as a header manually, please use the `cookies` param.")
if header.lower() == "accept-encoding":
# we cannot set an allowed encoding, or it will return compressed
# and the code is not set up to uncompress the data
continue
if header.lower() == "referer":
arguments.extend(["--referer", value])
continue
if header.lower() == "user-agent":
arguments.extend(["--user-agent", value])
continue
arguments.extend(["--header", f"{header}: {value}"])
yield dict(total=len(urls))
try:
p = subprocess.Popen([binaries.Aria2, *arguments], stdin=subprocess.PIPE, stdout=subprocess.DEVNULL)
p.stdin.write(url_file.encode())
p.stdin.close()
while p.poll() is None:
global_stats: dict[str, Any] = (
rpc(caller=partial(rpc_session.post, url=rpc_uri), secret=rpc_secret, method="aria2.getGlobalStat")
or {}
)
number_stopped = int(global_stats.get("numStoppedTotal", 0))
download_speed = int(global_stats.get("downloadSpeed", -1))
if number_stopped:
yield dict(completed=number_stopped)
if download_speed != -1:
yield dict(downloaded=f"{filesize.decimal(download_speed)}/s")
stopped_downloads: list[dict[str, Any]] = (
rpc(
caller=partial(rpc_session.post, url=rpc_uri),
secret=rpc_secret,
method="aria2.tellStopped",
params=[0, 999999],
)
or []
)
for dl in stopped_downloads:
if dl["status"] == "error":
used_uri = next(
uri["uri"]
for file in dl["files"]
if file["selected"] == "true"
for uri in file["uris"]
if uri["status"] == "used"
)
error = f"Download Error (#{dl['gid']}): {dl['errorMessage']} ({dl['errorCode']}), {used_uri}"
error_pretty = "\n ".join(
textwrap.wrap(error, width=console.width - 20, initial_indent="")
)
console.log(Text.from_ansi("\n[Aria2c]: " + error_pretty))
raise ValueError(error)
if number_stopped == len(urls):
rpc(caller=partial(rpc_session.post, url=rpc_uri), secret=rpc_secret, method="aria2.shutdown")
break
time.sleep(1)
p.wait()
if p.returncode != 0:
raise subprocess.CalledProcessError(p.returncode, arguments)
except ConnectionResetError:
# interrupted while passing URI to download
raise KeyboardInterrupt()
except subprocess.CalledProcessError as e:
if e.returncode in (7, 0xC000013A):
# 7 is when Aria2(c) handled the CTRL+C
# 0xC000013A is when it never got the chance to
raise KeyboardInterrupt()
raise
except KeyboardInterrupt:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield dict(downloaded="[yellow]CANCELLED")
raise
except Exception:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield dict(downloaded="[red]FAILED")
raise
finally:
rpc(caller=partial(rpc_session.post, url=rpc_uri), secret=rpc_secret, method="aria2.shutdown")
def aria2c(
urls: Union[str, list[str], dict[str, Any], list[dict[str, Any]]],
output_dir: Path,
filename: str,
headers: Optional[MutableMapping[str, Union[str, bytes]]] = None,
cookies: Optional[Union[MutableMapping[str, str], CookieJar]] = None,
proxy: Optional[str] = None,
max_workers: Optional[int] = None,
) -> Generator[dict[str, Any], None, None]:
"""
Download files using Aria2(c).
https://aria2.github.io
Yields the following download status updates while chunks are downloading:
- {total: 100} (100% download total)
- {completed: 1} (1% download progress out of 100%)
- {downloaded: "10.1 MB/s"} (currently downloading at a rate of 10.1 MB/s)
The data is in the same format accepted by rich's progress.update() function.
Parameters:
urls: Web URL(s) to file(s) to download. You can use a dictionary with the key
"url" for the URI, and other keys for extra arguments to use per-URL.
output_dir: The folder to save the file into. If the save path's directory does
not exist then it will be made automatically.
filename: The filename or filename template to use for each file. The variables
you can use are `i` for the URL index and `ext` for the URL extension.
headers: A mapping of HTTP Header Key/Values to use for all downloads.
cookies: A mapping of Cookie Key/Values or a Cookie Jar to use for all downloads.
proxy: An optional proxy URI to route connections through for all downloads.
max_workers: The maximum amount of threads to use for downloads. Defaults to
min(32,(cpu_count+4)). Used for the --max-concurrent-downloads option.
"""
if proxy and not proxy.lower().startswith("http://"):
# Only HTTP proxies are supported by aria2(c)
proxy = urlparse(proxy)
port = get_free_port()
username, password = get_random_bytes(8).hex(), get_random_bytes(8).hex()
local_proxy = f"http://{username}:{password}@localhost:{port}"
scheme = {"https": "http+ssl", "socks5h": "socks"}.get(proxy.scheme, proxy.scheme)
remote_server = f"{scheme}://{proxy.hostname}"
if proxy.port:
remote_server += f":{proxy.port}"
if proxy.username or proxy.password:
remote_server += "#"
if proxy.username:
remote_server += proxy.username
if proxy.password:
remote_server += f":{proxy.password}"
p = subprocess.Popen(
["pproxy", "-l", f"http://:{port}#{username}:{password}", "-r", remote_server],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
try:
yield from download(urls, output_dir, filename, headers, cookies, local_proxy, max_workers)
finally:
p.kill()
p.wait()
return
yield from download(urls, output_dir, filename, headers, cookies, proxy, max_workers)
__all__ = ("aria2c",)

View File

@ -0,0 +1,264 @@
import math
import time
from concurrent import futures
from concurrent.futures.thread import ThreadPoolExecutor
from http.cookiejar import CookieJar
from pathlib import Path
from typing import Any, Generator, MutableMapping, Optional, Union
from curl_cffi.requests import Session
from rich import filesize
from unshackle.core.config import config
from unshackle.core.constants import DOWNLOAD_CANCELLED
from unshackle.core.utilities import get_extension
MAX_ATTEMPTS = 5
RETRY_WAIT = 2
CHUNK_SIZE = 1024
PROGRESS_WINDOW = 5
BROWSER = config.curl_impersonate.get("browser", "chrome124")
def download(url: str, save_path: Path, session: Session, **kwargs: Any) -> Generator[dict[str, Any], None, None]:
"""
Download files using Curl Impersonate.
https://github.com/lwthiker/curl-impersonate
Yields the following download status updates while chunks are downloading:
- {total: 123} (there are 123 chunks to download)
- {total: None} (there are an unknown number of chunks to download)
- {advance: 1} (one chunk was downloaded)
- {downloaded: "10.1 MB/s"} (currently downloading at a rate of 10.1 MB/s)
- {file_downloaded: Path(...), written: 1024} (download finished, has the save path and size)
The data is in the same format accepted by rich's progress.update() function. The
`downloaded` key is custom and is not natively accepted by all rich progress bars.
Parameters:
url: Web URL of a file to download.
save_path: The path to save the file to. If the save path's directory does not
exist then it will be made automatically.
session: The Requests or Curl-Impersonate Session to make HTTP requests with.
Useful to set Header, Cookie, and Proxy data. Connections are saved and
re-used with the session so long as the server keeps the connection alive.
kwargs: Any extra keyword arguments to pass to the session.get() call. Use this
for one-time request changes like a header, cookie, or proxy. For example,
to request Byte-ranges use e.g., `headers={"Range": "bytes=0-128"}`.
"""
save_dir = save_path.parent
control_file = save_path.with_name(f"{save_path.name}.!dev")
save_dir.mkdir(parents=True, exist_ok=True)
if control_file.exists():
# consider the file corrupt if the control file exists
save_path.unlink(missing_ok=True)
control_file.unlink()
elif save_path.exists():
# if it exists, and no control file, then it should be safe to keep; skip re-downloading
yield dict(file_downloaded=save_path, written=save_path.stat().st_size)
return
# TODO: Design a control file format so we know how much of the file is missing
control_file.write_bytes(b"")
attempts = 1
try:
while True:
written = 0
download_sizes = []
last_speed_refresh = time.time()
try:
stream = session.get(url, stream=True, **kwargs)
stream.raise_for_status()
try:
content_length = int(stream.headers.get("Content-Length", "0"))
# Skip Content-Length validation for compressed responses since
# curl_impersonate automatically decompresses but Content-Length shows compressed size
if stream.headers.get("Content-Encoding", "").lower() in ["gzip", "deflate", "br"]:
content_length = 0
except ValueError:
content_length = 0
if content_length > 0:
yield dict(total=math.ceil(content_length / CHUNK_SIZE))
else:
# we have no data to calculate total chunks
yield dict(total=None) # indeterminate mode
with open(save_path, "wb") as f:
for chunk in stream.iter_content(chunk_size=CHUNK_SIZE):
download_size = len(chunk)
f.write(chunk)
written += download_size
yield dict(advance=1)
now = time.time()
time_since = now - last_speed_refresh
download_sizes.append(download_size)
if time_since > PROGRESS_WINDOW or download_size < CHUNK_SIZE:
data_size = sum(download_sizes)
download_speed = math.ceil(data_size / (time_since or 1))
yield dict(downloaded=f"{filesize.decimal(download_speed)}/s")
last_speed_refresh = now
download_sizes.clear()
if content_length and written < content_length:
raise IOError(f"Failed to read {content_length} bytes from the track URI.")
yield dict(file_downloaded=save_path, written=written)
break
except Exception as e:
save_path.unlink(missing_ok=True)
if DOWNLOAD_CANCELLED.is_set() or attempts == MAX_ATTEMPTS:
raise e
time.sleep(RETRY_WAIT)
attempts += 1
finally:
control_file.unlink()
def curl_impersonate(
urls: Union[str, list[str], dict[str, Any], list[dict[str, Any]]],
output_dir: Path,
filename: str,
headers: Optional[MutableMapping[str, Union[str, bytes]]] = None,
cookies: Optional[Union[MutableMapping[str, str], CookieJar]] = None,
proxy: Optional[str] = None,
max_workers: Optional[int] = None,
) -> Generator[dict[str, Any], None, None]:
"""
Download files using Curl Impersonate.
https://github.com/lwthiker/curl-impersonate
Yields the following download status updates while chunks are downloading:
- {total: 123} (there are 123 chunks to download)
- {total: None} (there are an unknown number of chunks to download)
- {advance: 1} (one chunk was downloaded)
- {downloaded: "10.1 MB/s"} (currently downloading at a rate of 10.1 MB/s)
- {file_downloaded: Path(...), written: 1024} (download finished, has the save path and size)
The data is in the same format accepted by rich's progress.update() function.
However, the `downloaded`, `file_downloaded` and `written` keys are custom and not
natively accepted by rich progress bars.
Parameters:
urls: Web URL(s) to file(s) to download. You can use a dictionary with the key
"url" for the URI, and other keys for extra arguments to use per-URL.
output_dir: The folder to save the file into. If the save path's directory does
not exist then it will be made automatically.
filename: The filename or filename template to use for each file. The variables
you can use are `i` for the URL index and `ext` for the URL extension.
headers: A mapping of HTTP Header Key/Values to use for all downloads.
cookies: A mapping of Cookie Key/Values or a Cookie Jar to use for all downloads.
proxy: An optional proxy URI to route connections through for all downloads.
max_workers: The maximum amount of threads to use for downloads. Defaults to
min(32,(cpu_count+4)).
"""
if not urls:
raise ValueError("urls must be provided and not empty")
elif not isinstance(urls, (str, dict, list)):
raise TypeError(f"Expected urls to be {str} or {dict} or a list of one of them, not {type(urls)}")
if not output_dir:
raise ValueError("output_dir must be provided")
elif not isinstance(output_dir, Path):
raise TypeError(f"Expected output_dir to be {Path}, not {type(output_dir)}")
if not filename:
raise ValueError("filename must be provided")
elif not isinstance(filename, str):
raise TypeError(f"Expected filename to be {str}, not {type(filename)}")
if not isinstance(headers, (MutableMapping, type(None))):
raise TypeError(f"Expected headers to be {MutableMapping}, not {type(headers)}")
if not isinstance(cookies, (MutableMapping, CookieJar, type(None))):
raise TypeError(f"Expected cookies to be {MutableMapping} or {CookieJar}, not {type(cookies)}")
if not isinstance(proxy, (str, type(None))):
raise TypeError(f"Expected proxy to be {str}, not {type(proxy)}")
if not isinstance(max_workers, (int, type(None))):
raise TypeError(f"Expected max_workers to be {int}, not {type(max_workers)}")
if not isinstance(urls, list):
urls = [urls]
urls = [
dict(save_path=save_path, **url) if isinstance(url, dict) else dict(url=url, save_path=save_path)
for i, url in enumerate(urls)
for save_path in [
output_dir / filename.format(i=i, ext=get_extension(url["url"] if isinstance(url, dict) else url))
]
]
session = Session(impersonate=BROWSER)
if headers:
headers = {k: v for k, v in headers.items() if k.lower() != "accept-encoding"}
session.headers.update(headers)
if cookies:
session.cookies.update(cookies)
if proxy:
session.proxies.update({"all": proxy})
yield dict(total=len(urls))
download_sizes = []
last_speed_refresh = time.time()
with ThreadPoolExecutor(max_workers=max_workers) as pool:
for i, future in enumerate(
futures.as_completed((pool.submit(download, session=session, **url) for url in urls))
):
file_path, download_size = None, None
try:
for status_update in future.result():
if status_update.get("file_downloaded") and status_update.get("written"):
file_path = status_update["file_downloaded"]
download_size = status_update["written"]
elif len(urls) == 1:
# these are per-chunk updates, only useful if it's one big file
yield status_update
except KeyboardInterrupt:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield dict(downloaded="[yellow]CANCELLING")
pool.shutdown(wait=True, cancel_futures=True)
yield dict(downloaded="[yellow]CANCELLED")
# tell dl that it was cancelled
# the pool is already shut down, so exiting loop is fine
raise
except Exception:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield dict(downloaded="[red]FAILING")
pool.shutdown(wait=True, cancel_futures=True)
yield dict(downloaded="[red]FAILED")
# tell dl that it failed
# the pool is already shut down, so exiting loop is fine
raise
else:
yield dict(file_downloaded=file_path)
yield dict(advance=1)
now = time.time()
time_since = now - last_speed_refresh
if download_size: # no size == skipped dl
download_sizes.append(download_size)
if download_sizes and (time_since > PROGRESS_WINDOW or i == len(urls) - 1):  # always flush on the last future
data_size = sum(download_sizes)
download_speed = math.ceil(data_size / (time_since or 1))
yield dict(downloaded=f"{filesize.decimal(download_speed)}/s")
last_speed_refresh = now
download_sizes.clear()
__all__ = ("curl_impersonate",)

View File

@ -0,0 +1,385 @@
import os
import re
import subprocess
import warnings
from http.cookiejar import CookieJar
from pathlib import Path
from typing import Any, Generator, MutableMapping
import requests
from requests.cookies import cookiejar_from_dict, get_cookie_header
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import DOWNLOAD_CANCELLED
PERCENT_RE = re.compile(r"(\d+\.\d+%)")
SPEED_RE = re.compile(r"(\d+\.\d+(?:MB|KB)ps)")
SIZE_RE = re.compile(r"(\d+\.\d+(?:MB|GB|KB)/\d+\.\d+(?:MB|GB|KB))")
WARN_RE = re.compile(r"(WARN : Response.*|WARN : One or more errors occurred.*)")
ERROR_RE = re.compile(r"(ERROR.*)")
DECRYPTION_ENGINE = {
"shaka": "SHAKA_PACKAGER",
"mp4decrypt": "MP4DECRYPT",
}
# Ignore FutureWarnings
warnings.simplefilter(action="ignore", category=FutureWarning)
def get_track_selection_args(track: Any) -> list[str]:
"""
Generates track selection arguments for N_m3u8dl_RE.
Args:
track: A track object with attributes like descriptor, data, and class name.
Returns:
A list of strings for track selection.
Raises:
ValueError: If the manifest type is unsupported or track selection fails.
"""
descriptor = track.descriptor.name
track_type = track.__class__.__name__
def _create_args(flag: str, parts: list[str], type_str: str, extra_args: list[str] | None = None) -> list[str]:
if not parts:
raise ValueError(f"[N_m3u8DL-RE]: Unable to select {type_str} track from {descriptor} manifest")
final_args = [flag, ":".join(parts)]
if extra_args:
final_args.extend(extra_args)
return final_args
match descriptor:
case "HLS":
# HLS playlists are direct inputs; no selection arguments needed.
return []
case "DASH":
representation = track.data.get("dash", {}).get("representation", {})
adaptation_set = track.data.get("dash", {}).get("adaptation_set", {})
parts = []
if track_type == "Audio":
if track_id := representation.get("id") or adaptation_set.get("audioTrackId"):
parts.append(rf'"id=\b{track_id}\b"')
else:
if codecs := representation.get("codecs"):
parts.append(f"codecs={codecs}")
if lang := representation.get("lang") or adaptation_set.get("lang"):
parts.append(f"lang={lang}")
if bw := representation.get("bandwidth"):
bitrate = int(bw) // 1000
parts.append(f"bwMin={bitrate}:bwMax={bitrate + 5}")
if roles := representation.findall("Role") + adaptation_set.findall("Role"):
if role := next((r.get("value") for r in roles if r.get("value", "").lower() == "main"), None):
parts.append(f"role={role}")
return _create_args("-sa", parts, "audio")
if track_type == "Video":
if track_id := representation.get("id"):
parts.append(rf'"id=\b{track_id}\b"')
else:
if width := representation.get("width"):
parts.append(f"res={width}*")
if codecs := representation.get("codecs"):
parts.append(f"codecs={codecs}")
if bw := representation.get("bandwidth"):
bitrate = int(bw) // 1000
parts.append(f"bwMin={bitrate}:bwMax={bitrate + 5}")
return _create_args("-sv", parts, "video")
if track_type == "Subtitle":
if track_id := representation.get("id"):
parts.append(rf'"id=\b{track_id}\b"')
else:
if lang := representation.get("lang"):
parts.append(f"lang={lang}")
return _create_args("-ss", parts, "subtitle", extra_args=["--auto-subtitle-fix", "false"])
case "ISM":
quality_level = track.data.get("ism", {}).get("quality_level", {})
stream_index = track.data.get("ism", {}).get("stream_index", {})
parts = []
if track_type == "Audio":
if name := stream_index.get("Name") or quality_level.get("Index"):
parts.append(rf'"id=\b{name}\b"')
else:
if codecs := quality_level.get("FourCC"):
parts.append(f"codecs={codecs}")
if lang := stream_index.get("Language"):
parts.append(f"lang={lang}")
if br := quality_level.get("Bitrate"):
bitrate = int(br) // 1000
parts.append(f"bwMin={bitrate}:bwMax={bitrate + 5}")
return _create_args("-sa", parts, "audio")
if track_type == "Video":
if name := stream_index.get("Name") or quality_level.get("Index"):
parts.append(rf'"id=\b{name}\b"')
else:
if width := quality_level.get("MaxWidth"):
parts.append(f"res={width}*")
if codecs := quality_level.get("FourCC"):
parts.append(f"codecs={codecs}")
if br := quality_level.get("Bitrate"):
bitrate = int(br) // 1000
parts.append(f"bwMin={bitrate}:bwMax={bitrate + 5}")
return _create_args("-sv", parts, "video")
# I've yet to encounter a subtitle track in ISM manifests, so this is mostly theoretical.
if track_type == "Subtitle":
if name := stream_index.get("Name") or quality_level.get("Index"):
parts.append(rf'"id=\b{name}\b"')
else:
if lang := stream_index.get("Language"):
parts.append(f"lang={lang}")
return _create_args("-ss", parts, "subtitle", extra_args=["--auto-subtitle-fix", "false"])
case "URL":
raise ValueError(
f"[N_m3u8DL-RE]: Direct URL downloads are not supported for {track_type} tracks. "
f"The track should use a different downloader (e.g., 'requests', 'aria2c')."
)
raise ValueError(f"[N_m3u8DL-RE]: Unsupported manifest type: {descriptor}")
def build_download_args(
track_url: str,
filename: str,
output_dir: Path,
thread_count: int,
retry_count: int,
track_from_file: Path | None,
custom_args: dict[str, Any] | None,
headers: dict[str, Any] | None,
cookies: CookieJar | None,
proxy: str | None,
content_keys: dict[str, str] | None,
ad_keyword: str | None,
skip_merge: bool | None = False,
) -> list[str]:
"""Constructs the CLI arguments for N_m3u8DL-RE."""
# Default arguments
args = {
"--save-name": filename,
"--save-dir": output_dir,
"--tmp-dir": output_dir,
"--thread-count": thread_count,
"--download-retry-count": retry_count,
"--write-meta-json": False,
"--no-log": True,
}
if proxy:
args["--custom-proxy"] = proxy
if skip_merge:
args["--skip-merge"] = skip_merge
if ad_keyword:
args["--ad-keyword"] = ad_keyword
if content_keys:
# N_m3u8DL-RE's --key flag is given a single KID:key pair here; only the first content key is used
args["--key"] = next((f"{kid.hex}:{key.lower()}" for kid, key in content_keys.items()), None)
args["--decryption-engine"] = DECRYPTION_ENGINE.get(config.decryption.lower()) or "SHAKA_PACKAGER"
if custom_args:
args.update(custom_args)
command = [track_from_file or track_url]
for flag, value in args.items():
if value is True:
command.append(flag)
elif value is False:
command.extend([flag, "false"])
elif value is not None:  # True/False are already handled above
command.extend([flag, str(value)])
if headers:
for key, value in headers.items():
if key.lower() not in ("accept-encoding", "cookie"):
command.extend(["--header", f"{key}: {value}"])
if cookies:
req = requests.Request(method="GET", url=track_url)
cookie_header = get_cookie_header(cookies, req)
command.extend(["--header", f"Cookie: {cookie_header}"])
return command
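# For example (illustrative values): with only the defaults set, the resulting
# argument list comes out as
#   ["https://example.com/m.mpd",
#    "--save-name", "video", "--save-dir", "out", "--tmp-dir", "out",
#    "--thread-count", "16", "--download-retry-count", "10",
#    "--write-meta-json", "false", "--no-log"]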
def download(
urls: str | dict[str, Any] | list[str | dict[str, Any]],
track: Any,
output_dir: Path,
filename: str,
headers: MutableMapping[str, str | bytes] | None,
cookies: MutableMapping[str, str] | CookieJar | None,
proxy: str | None,
max_workers: int | None,
content_keys: dict[str, Any] | None,
skip_merge: bool | None = False,
) -> Generator[dict[str, Any], None, None]:
if not urls:
raise ValueError("urls must be provided and not empty")
if not isinstance(urls, (str, dict, list)):
raise TypeError(f"Expected urls to be str, dict, or list, not {type(urls)}")
if not isinstance(output_dir, Path):
raise TypeError(f"Expected output_dir to be Path, not {type(output_dir)}")
if not isinstance(filename, str) or not filename:
raise ValueError("filename must be a non-empty string")
if not isinstance(headers, (MutableMapping, type(None))):
raise TypeError(f"Expected headers to be a mapping or None, not {type(headers)}")
if not isinstance(cookies, (MutableMapping, CookieJar, type(None))):
raise TypeError(f"Expected cookies to be a mapping, CookieJar, or None, not {type(cookies)}")
if not isinstance(proxy, (str, type(None))):
raise TypeError(f"Expected proxy to be a str or None, not {type(proxy)}")
if not isinstance(max_workers, (int, type(None))):
raise TypeError(f"Expected max_workers to be an int or None, not {type(max_workers)}")
if not isinstance(content_keys, (dict, type(None))):
raise TypeError(f"Expected content_keys to be a dict or None, not {type(content_keys)}")
if not isinstance(skip_merge, (bool, type(None))):
raise TypeError(f"Expected skip_merge to be a bool or None, not {type(skip_merge)}")
if cookies and not isinstance(cookies, CookieJar):
cookies = cookiejar_from_dict(cookies)
if not binaries.N_m3u8DL_RE:
raise EnvironmentError("N_m3u8DL-RE executable not found...")
effective_max_workers = max_workers or min(32, (os.cpu_count() or 1) + 4)
if proxy and not config.n_m3u8dl_re.get("use_proxy", True):
proxy = None
thread_count = config.n_m3u8dl_re.get("thread_count", effective_max_workers)
retry_count = config.n_m3u8dl_re.get("retry_count", 10)
ad_keyword = config.n_m3u8dl_re.get("ad_keyword")
arguments = build_download_args(
track_url=track.url,
track_from_file=track.from_file,
filename=filename,
output_dir=output_dir,
thread_count=thread_count,
retry_count=retry_count,
custom_args=track.downloader_args,
headers=headers,
cookies=cookies,
proxy=proxy,
content_keys=content_keys,
skip_merge=skip_merge,
ad_keyword=ad_keyword,
)
arguments.extend(get_track_selection_args(track))
yield {"total": 100}
yield {"downloaded": "Parsing streams..."}
try:
with subprocess.Popen(
[binaries.N_m3u8DL_RE, *arguments],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
encoding="utf-8",
) as process:
last_line = ""
track_type = track.__class__.__name__
for line in process.stdout:
output = line.strip()
if not output:
continue
last_line = output
if warn_match := WARN_RE.search(output):
console.log(f"{track_type} {warn_match.group(1)}")
continue
if speed_match := SPEED_RE.search(output):
size = size_match.group(1) if (size_match := SIZE_RE.search(output)) else ""
yield {"downloaded": f"{speed_match.group(1)} {size}"}
if percent_match := PERCENT_RE.search(output):
progress = int(percent_match.group(1).split(".", 1)[0])
yield {"completed": progress} if progress < 100 else {"downloaded": "Merging"}
process.wait()
if process.returncode != 0:
if error_match := ERROR_RE.search(last_line):
raise ValueError(f"[N_m3u8DL-RE]: {error_match.group(1)}")
raise subprocess.CalledProcessError(process.returncode, arguments)
except ConnectionResetError:
# interrupted while passing URI to download
raise KeyboardInterrupt()
except KeyboardInterrupt:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield {"downloaded": "[yellow]CANCELLED"}
raise
except Exception:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield {"downloaded": "[red]FAILED"}
raise
def n_m3u8dl_re(
urls: str | list[str] | dict[str, Any] | list[dict[str, Any]],
track: Any,
output_dir: Path,
filename: str,
headers: MutableMapping[str, str | bytes] | None = None,
cookies: MutableMapping[str, str] | CookieJar | None = None,
proxy: str | None = None,
max_workers: int | None = None,
content_keys: dict[str, Any] | None = None,
skip_merge: bool | None = False,
) -> Generator[dict[str, Any], None, None]:
"""
Download files using N_m3u8DL-RE.
https://github.com/nilaoda/N_m3u8DL-RE
Yields the following download status updates while chunks are downloading:
- {total: 100} (100% download total)
- {completed: 1} (1% download progress out of 100%)
- {downloaded: "10.1 MB/s"} (currently downloading at a rate of 10.1 MB/s)
The data is in the same format accepted by rich's progress.update() function.
Parameters:
urls: Web URL(s) to file(s) to download. NOTE: This parameter is ignored for now.
track: The track to download. Used to get track attributes for the selection
process. Note that Track.Descriptor.URL is not supported by N_m3u8DL-RE.
output_dir: The folder to save the file into. If the save path's directory does
not exist then it will be made automatically.
filename: The filename or filename template to use for each file.
headers: A mapping of HTTP Header Key/Values to use for all downloads.
cookies: A mapping of Cookie Key/Values or a Cookie Jar to use for all downloads.
proxy: A proxy to use for all downloads.
max_workers: The maximum amount of threads to use for downloads. Defaults to
min(32,(cpu_count+4)); can be overridden by the thread_count config option
(passed to N_m3u8DL-RE as --thread-count).
content_keys: The content keys to use for decryption.
skip_merge: Whether to skip merging the downloaded chunks.
"""
yield from download(
urls=urls,
track=track,
output_dir=output_dir,
filename=filename,
headers=headers,
cookies=cookies,
proxy=proxy,
max_workers=max_workers,
content_keys=content_keys,
skip_merge=skip_merge,
)
__all__ = ("n_m3u8dl_re",)

View File

@ -0,0 +1,271 @@
import math
import os
import time
from concurrent.futures import as_completed
from concurrent.futures.thread import ThreadPoolExecutor
from http.cookiejar import CookieJar
from pathlib import Path
from typing import Any, Generator, MutableMapping, Optional, Union
from requests import Session
from requests.adapters import HTTPAdapter
from rich import filesize
from unshackle.core.constants import DOWNLOAD_CANCELLED
from unshackle.core.utilities import get_extension
MAX_ATTEMPTS = 5
RETRY_WAIT = 2
CHUNK_SIZE = 1024
PROGRESS_WINDOW = 5
DOWNLOAD_SIZES = []
LAST_SPEED_REFRESH = time.time()
def download(
url: str, save_path: Path, session: Optional[Session] = None, segmented: bool = False, **kwargs: Any
) -> Generator[dict[str, Any], None, None]:
"""
Download a file using Python Requests.
https://requests.readthedocs.io
Yields the following download status updates while chunks are downloading:
- {total: 123} (there are 123 chunks to download)
- {total: None} (there are an unknown number of chunks to download)
- {advance: 1} (one chunk was downloaded)
- {downloaded: "10.1 MB/s"} (currently downloading at a rate of 10.1 MB/s)
- {file_downloaded: Path(...), written: 1024} (download finished, has the save path and size)
The data is in the same format accepted by rich's progress.update() function. The
`downloaded` key is custom and is not natively accepted by all rich progress bars.
Parameters:
url: Web URL of a file to download.
save_path: The path to save the file to. If the save path's directory does not
exist then it will be made automatically.
session: The Requests Session to make HTTP requests with. Useful to set Header,
Cookie, and Proxy data. Connections are saved and re-used with the session
so long as the server keeps the connection alive.
segmented: Whether the download is a segment or part of one bigger file.
kwargs: Any extra keyword arguments to pass to the session.get() call. Use this
for one-time request changes like a header, cookie, or proxy. For example,
to request Byte-ranges use e.g., `headers={"Range": "bytes=0-128"}`.
"""
global LAST_SPEED_REFRESH
session = session or Session()
save_dir = save_path.parent
control_file = save_path.with_name(f"{save_path.name}.!dev")
save_dir.mkdir(parents=True, exist_ok=True)
if control_file.exists():
# consider the file corrupt if the control file exists
save_path.unlink(missing_ok=True)
control_file.unlink()
elif save_path.exists():
# if it exists, and no control file, then it should be safe to keep; skip re-downloading
yield dict(file_downloaded=save_path, written=save_path.stat().st_size)
return
# TODO: Design a control file format so we know how much of the file is missing
control_file.write_bytes(b"")
attempts = 1
try:
while True:
written = 0
# these are for single-url speed calcs only
download_sizes = []
last_speed_refresh = time.time()
try:
stream = session.get(url, stream=True, **kwargs)
stream.raise_for_status()
if not segmented:
try:
content_length = int(stream.headers.get("Content-Length", "0"))
# Skip Content-Length validation for compressed responses since
# requests automatically decompresses but Content-Length shows compressed size
if stream.headers.get("Content-Encoding", "").lower() in ["gzip", "deflate", "br"]:
content_length = 0
except ValueError:
content_length = 0
if content_length > 0:
yield dict(total=math.ceil(content_length / CHUNK_SIZE))
else:
# we have no data to calculate total chunks
yield dict(total=None) # indeterminate mode
with open(save_path, "wb") as f:
for chunk in stream.iter_content(chunk_size=CHUNK_SIZE):
download_size = len(chunk)
f.write(chunk)
written += download_size
if not segmented:
yield dict(advance=1)
now = time.time()
time_since = now - last_speed_refresh
download_sizes.append(download_size)
if time_since > PROGRESS_WINDOW or download_size < CHUNK_SIZE:
data_size = sum(download_sizes)
download_speed = math.ceil(data_size / (time_since or 1))
yield dict(downloaded=f"{filesize.decimal(download_speed)}/s")
last_speed_refresh = now
download_sizes.clear()
if content_length and written < content_length:
raise IOError(f"Failed to read {content_length} bytes from the track URI.")
yield dict(file_downloaded=save_path, written=written)
if segmented:
yield dict(advance=1)
now = time.time()
time_since = now - LAST_SPEED_REFRESH
if written: # no size == skipped dl
DOWNLOAD_SIZES.append(written)
if DOWNLOAD_SIZES and time_since > PROGRESS_WINDOW:
data_size = sum(DOWNLOAD_SIZES)
download_speed = math.ceil(data_size / (time_since or 1))
yield dict(downloaded=f"{filesize.decimal(download_speed)}/s")
LAST_SPEED_REFRESH = now
DOWNLOAD_SIZES.clear()
break
except Exception as e:
save_path.unlink(missing_ok=True)
if DOWNLOAD_CANCELLED.is_set() or attempts == MAX_ATTEMPTS:
raise e
time.sleep(RETRY_WAIT)
attempts += 1
finally:
control_file.unlink()
def requests(
urls: Union[str, list[str], dict[str, Any], list[dict[str, Any]]],
output_dir: Path,
filename: str,
headers: Optional[MutableMapping[str, Union[str, bytes]]] = None,
cookies: Optional[Union[MutableMapping[str, str], CookieJar]] = None,
proxy: Optional[str] = None,
max_workers: Optional[int] = None,
) -> Generator[dict[str, Any], None, None]:
"""
Download a file using Python Requests.
https://requests.readthedocs.io
Yields the following download status updates while chunks are downloading:
- {total: 123} (there are 123 chunks to download)
- {total: None} (there are an unknown number of chunks to download)
- {advance: 1} (one chunk was downloaded)
- {downloaded: "10.1 MB/s"} (currently downloading at a rate of 10.1 MB/s)
- {file_downloaded: Path(...), written: 1024} (download finished, has the save path and size)
The data is in the same format accepted by rich's progress.update() function.
However, the `downloaded`, `file_downloaded` and `written` keys are custom and not
natively accepted by rich progress bars.
Parameters:
urls: Web URL(s) to file(s) to download. You can use a dictionary with the key
"url" for the URI, and other keys for extra arguments to use per-URL.
output_dir: The folder to save the file into. If the save path's directory does
not exist then it will be made automatically.
filename: The filename or filename template to use for each file. The variables
you can use are `i` for the URL index and `ext` for the URL extension.
headers: A mapping of HTTP Header Key/Values to use for all downloads.
cookies: A mapping of Cookie Key/Values or a Cookie Jar to use for all downloads.
proxy: An optional proxy URI to route connections through for all downloads.
max_workers: The maximum amount of threads to use for downloads. Defaults to
min(32,(cpu_count+4)).
"""
if not urls:
raise ValueError("urls must be provided and not empty")
elif not isinstance(urls, (str, dict, list)):
raise TypeError(f"Expected urls to be {str} or {dict} or a list of one of them, not {type(urls)}")
if not output_dir:
raise ValueError("output_dir must be provided")
elif not isinstance(output_dir, Path):
raise TypeError(f"Expected output_dir to be {Path}, not {type(output_dir)}")
if not filename:
raise ValueError("filename must be provided")
elif not isinstance(filename, str):
raise TypeError(f"Expected filename to be {str}, not {type(filename)}")
if not isinstance(headers, (MutableMapping, type(None))):
raise TypeError(f"Expected headers to be {MutableMapping}, not {type(headers)}")
if not isinstance(cookies, (MutableMapping, CookieJar, type(None))):
raise TypeError(f"Expected cookies to be {MutableMapping} or {CookieJar}, not {type(cookies)}")
if not isinstance(proxy, (str, type(None))):
raise TypeError(f"Expected proxy to be {str}, not {type(proxy)}")
if not isinstance(max_workers, (int, type(None))):
raise TypeError(f"Expected max_workers to be {int}, not {type(max_workers)}")
if not isinstance(urls, list):
urls = [urls]
if not max_workers:
max_workers = min(32, (os.cpu_count() or 1) + 4)
urls = [
dict(save_path=save_path, **url) if isinstance(url, dict) else dict(url=url, save_path=save_path)
for i, url in enumerate(urls)
for save_path in [
output_dir / filename.format(i=i, ext=get_extension(url["url"] if isinstance(url, dict) else url))
]
]
session = Session()
session.mount("https://", HTTPAdapter(pool_connections=max_workers, pool_maxsize=max_workers, pool_block=True))
session.mount("http://", session.adapters["https://"])
if headers:
headers = {k: v for k, v in headers.items() if k.lower() != "accept-encoding"}
session.headers.update(headers)
if cookies:
session.cookies.update(cookies)
if proxy:
session.proxies.update({"all": proxy})
yield dict(total=len(urls))
try:
with ThreadPoolExecutor(max_workers=max_workers) as pool:
for future in as_completed(pool.submit(download, session=session, segmented=False, **url) for url in urls):
try:
yield from future.result()
except KeyboardInterrupt:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield dict(downloaded="[yellow]CANCELLING")
pool.shutdown(wait=True, cancel_futures=True)
yield dict(downloaded="[yellow]CANCELLED")
# tell dl that it was cancelled
# the pool is already shut down, so exiting loop is fine
raise
except Exception:
DOWNLOAD_CANCELLED.set() # skip pending track downloads
yield dict(downloaded="[red]FAILING")
pool.shutdown(wait=True, cancel_futures=True)
yield dict(downloaded="[red]FAILED")
# tell dl that it failed
# the pool is already shut down, so exiting loop is fine
raise
finally:
DOWNLOAD_SIZES.clear()
__all__ = ("requests",)

View File

@ -0,0 +1,10 @@
from typing import Union
from unshackle.core.drm.clearkey import ClearKey
from unshackle.core.drm.playready import PlayReady
from unshackle.core.drm.widevine import Widevine
DRM_T = Union[ClearKey, Widevine, PlayReady]
__all__ = ("ClearKey", "Widevine", "PlayReady", "DRM_T")

View File

@ -0,0 +1,112 @@
from __future__ import annotations
import base64
import shutil
from pathlib import Path
from typing import Optional, Union
from urllib.parse import urljoin
from Cryptodome.Cipher import AES
from Cryptodome.Util.Padding import unpad
from curl_cffi.requests import Session as CurlSession
from m3u8.model import Key
from requests import Session
class ClearKey:
"""AES Clear Key DRM System."""
def __init__(self, key: Union[bytes, str], iv: Optional[Union[bytes, str]] = None):
"""
Generally, an IV should be provided where possible. If not provided, it will be
set to \x00 bytes of the same length as the key.
"""
if isinstance(key, str):
key = bytes.fromhex(key.replace("0x", ""))
if not isinstance(key, bytes):
raise ValueError(f"Expected AES Key to be bytes, not {key!r}")
if not iv:
iv = b"\x00"
if isinstance(iv, str):
iv = bytes.fromhex(iv.replace("0x", ""))
if not isinstance(iv, bytes):
raise ValueError(f"Expected IV to be bytes, not {iv!r}")
if len(iv) < len(key):
# repeat the IV up to the key's length, trimming any overshoot
iv = (iv * (len(key) // len(iv) + 1))[: len(key)]
self.key: bytes = key
self.iv: bytes = iv
def decrypt(self, path: Path) -> None:
"""Decrypt a Track with AES Clear Key DRM."""
if not path or not path.exists():
raise ValueError("Tried to decrypt a file that does not exist.")
decrypted = AES.new(self.key, AES.MODE_CBC, self.iv).decrypt(path.read_bytes())
try:
decrypted = unpad(decrypted, AES.block_size)
except ValueError:
# the decrypted data is likely already in the block size boundary
pass
decrypted_path = path.with_suffix(f".decrypted{path.suffix}")
decrypted_path.write_bytes(decrypted)
path.unlink()
shutil.move(decrypted_path, path)
@classmethod
def from_m3u_key(cls, m3u_key: Key, session: Optional[Session] = None) -> ClearKey:
"""
Load a ClearKey from an M3U(8) Playlist's EXT-X-KEY.
Parameters:
m3u_key: A Key object parsed from a m3u(8) playlist using
the `m3u8` library.
session: Optional session used to request external URIs with.
Useful to set headers, proxies, cookies, and so forth.
"""
if not isinstance(m3u_key, Key):
raise ValueError(f"Provided M3U Key is in an unexpected type {m3u_key!r}")
if not isinstance(session, (Session, CurlSession, type(None))):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not a {type(session)}")
if not m3u_key.method.startswith("AES"):
raise ValueError(f"Provided M3U Key is not an AES Clear Key, {m3u_key.method}")
if not m3u_key.uri:
raise ValueError("No URI in M3U Key, unable to get Key.")
if not session:
session = Session()
if not session.headers.get("User-Agent"):
# commonly needed default for HLS playlists
session.headers["User-Agent"] = "smartexoplayer/1.1.0 (Linux;Android 8.0.0) ExoPlayerLib/2.13.3"
if m3u_key.uri.startswith("data:"):
media_types, data = m3u_key.uri[5:].split(",")
media_types = media_types.split(";")
if "base64" in media_types:
data = base64.b64decode(data)
key = data
else:
url = urljoin(m3u_key.base_uri, m3u_key.uri)
res = session.get(url)
res.raise_for_status()
if not res.content:
raise EOFError("Unexpected Empty Response by M3U Key URI.")
if len(res.content) < 16:
raise EOFError(f"Unexpected Length of Key ({len(res.content)} bytes) in M3U Key.")
key = res.content
if m3u_key.iv:
iv = bytes.fromhex(m3u_key.iv.replace("0x", ""))
else:
iv = None
return cls(key=key, iv=iv)
__all__ = ("ClearKey",)

View File

@ -0,0 +1,442 @@
from __future__ import annotations
import base64
import shutil
import subprocess
import textwrap
from pathlib import Path
from typing import Any, Callable, Optional, Union
from uuid import UUID
import m3u8
from construct import Container
from pymp4.parser import Box
from pyplayready.cdm import Cdm as PlayReadyCdm
from pyplayready.system.pssh import PSSH
from requests import Session
from rich.text import Text
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import AnyTrack
from unshackle.core.utilities import get_boxes
from unshackle.core.utils.subprocess import ffprobe
class PlayReady:
"""PlayReady DRM System."""
def __init__(
self,
pssh: PSSH,
kid: Union[UUID, str, bytes, None] = None,
pssh_b64: Optional[str] = None,
**kwargs: Any,
):
if not pssh:
raise ValueError("Provided PSSH is empty.")
if not isinstance(pssh, PSSH):
raise TypeError(f"Expected pssh to be a {PSSH}, not {pssh!r}")
if pssh_b64:
kids = self._extract_kids_from_pssh_b64(pssh_b64)
else:
kids = []
# Extract KIDs using pyplayready's method (may miss some KIDs)
if not kids:
for header in pssh.wrm_headers:
try:
signed_ids, _, _, _ = header.read_attributes()
except Exception:
continue
for signed_id in signed_ids:
try:
kids.append(UUID(bytes_le=base64.b64decode(signed_id.value)))
except Exception:
continue
if kid:
if isinstance(kid, str):
kid = UUID(hex=kid)
elif isinstance(kid, bytes):
kid = UUID(bytes=kid)
if not isinstance(kid, UUID):
raise ValueError(f"Expected kid to be a {UUID}, str, or bytes, not {kid!r}")
if kid not in kids:
kids.append(kid)
self._pssh = pssh
self._kids = kids
if not self.kids:
raise PlayReady.Exceptions.KIDNotFound("No Key ID was found within PSSH and none were provided.")
self.content_keys: dict[UUID, str] = {}
self.data: dict = kwargs or {}
if pssh_b64:
self.data.setdefault("pssh_b64", pssh_b64)
def _extract_kids_from_pssh_b64(self, pssh_b64: str) -> list[UUID]:
"""Extract all KIDs from base64-encoded PSSH data."""
try:
import xml.etree.ElementTree as ET
# Decode the PSSH
pssh_bytes = base64.b64decode(pssh_b64)
# Try to find XML in the PSSH data
# PlayReady PSSH usually has XML embedded in it
pssh_str = pssh_bytes.decode("utf-16le", errors="ignore")
# Find WRMHEADER
xml_start = pssh_str.find("<WRMHEADER")
if xml_start == -1:
# Try UTF-8
pssh_str = pssh_bytes.decode("utf-8", errors="ignore")
xml_start = pssh_str.find("<WRMHEADER")
if xml_start != -1:
clean_xml = pssh_str[xml_start:]
xml_end = clean_xml.find("</WRMHEADER>") + len("</WRMHEADER>")
clean_xml = clean_xml[:xml_end]
root = ET.fromstring(clean_xml)
ns = {"pr": "http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader"}
kids = []
# Extract from CUSTOMATTRIBUTES/KIDS
kid_elements = root.findall(".//pr:CUSTOMATTRIBUTES/pr:KIDS/pr:KID", ns)
for kid_elem in kid_elements:
value = kid_elem.get("VALUE")
if value:
try:
kid_bytes = base64.b64decode(value + "==")
kid_uuid = UUID(bytes_le=kid_bytes)
kids.append(kid_uuid)
except Exception:
pass
# Also get individual KID
individual_kids = root.findall(".//pr:DATA/pr:KID", ns)
for kid_elem in individual_kids:
if kid_elem.text:
try:
kid_bytes = base64.b64decode(kid_elem.text.strip() + "==")
kid_uuid = UUID(bytes_le=kid_bytes)
if kid_uuid not in kids:
kids.append(kid_uuid)
except Exception:
pass
return kids
except Exception:
pass
return []
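# For example (illustrative): a WRMHEADER KID VALUE of "AAAAAAAAAAAAAAAAAAAAAA=="
# decodes to sixteen zero bytes, which UUID(bytes_le=...) reads as the all-zero
# KID 00000000-0000-0000-0000-000000000000.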
@classmethod
def from_track(cls, track: AnyTrack, session: Optional[Session] = None) -> PlayReady:
if not session:
session = Session()
session.headers.update(config.headers)
kid: Optional[UUID] = None
pssh_boxes: list[Container] = []
tenc_boxes: list[Container] = []
if track.descriptor == track.Descriptor.HLS:
m3u_url = track.url
master = m3u8.loads(session.get(m3u_url).text, uri=m3u_url)
pssh_boxes.extend(
Box.parse(base64.b64decode(x.uri.split(",")[-1]))
for x in (master.session_keys or master.keys)
if x and x.keyformat and "playready" in x.keyformat.lower()
)
init_data = track.get_init_segment(session=session)
if init_data:
probe = ffprobe(init_data)
if probe:
for stream in probe.get("streams") or []:
enc_key_id = stream.get("tags", {}).get("enc_key_id")
if enc_key_id:
kid = UUID(bytes=base64.b64decode(enc_key_id))
pssh_boxes.extend(list(get_boxes(init_data, b"pssh")))
tenc_boxes.extend(list(get_boxes(init_data, b"tenc")))
pssh = next((b for b in pssh_boxes if b.system_ID == PSSH.SYSTEM_ID.bytes), None)
if not pssh:
raise PlayReady.Exceptions.PSSHNotFound("PSSH was not found in track data.")
tenc = next(iter(tenc_boxes), None)
if not kid and tenc and tenc.key_ID.int != 0:
kid = tenc.key_ID
pssh_bytes = Box.build(pssh)
return cls(pssh=PSSH(pssh_bytes), kid=kid, pssh_b64=base64.b64encode(pssh_bytes).decode())
@classmethod
def from_init_data(cls, init_data: bytes) -> PlayReady:
if not init_data:
raise ValueError("Init data should be provided.")
if not isinstance(init_data, bytes):
raise TypeError(f"Expected init data to be bytes, not {init_data!r}")
kid: Optional[UUID] = None
pssh_boxes: list[Container] = list(get_boxes(init_data, b"pssh"))
tenc_boxes: list[Container] = list(get_boxes(init_data, b"tenc"))
probe = ffprobe(init_data)
if probe:
for stream in probe.get("streams") or []:
enc_key_id = stream.get("tags", {}).get("enc_key_id")
if enc_key_id:
kid = UUID(bytes=base64.b64decode(enc_key_id))
pssh = next((b for b in pssh_boxes if b.system_ID == PSSH.SYSTEM_ID.bytes), None)
if not pssh:
raise PlayReady.Exceptions.PSSHNotFound("PSSH was not found in track data.")
tenc = next(iter(tenc_boxes), None)
if not kid and tenc and tenc.key_ID.int != 0:
kid = tenc.key_ID
pssh_bytes = Box.build(pssh)
return cls(pssh=PSSH(pssh_bytes), kid=kid, pssh_b64=base64.b64encode(pssh_bytes).decode())
@property
def pssh(self) -> PSSH:
return self._pssh
@property
def pssh_b64(self) -> Optional[str]:
return self.data.get("pssh_b64")
@property
def kid(self) -> Optional[UUID]:
return next(iter(self.kids), None)
@property
def kids(self) -> list[UUID]:
return self._kids
def _extract_keys_from_cdm(self, cdm: PlayReadyCdm, session_id: bytes) -> dict:
"""Extract keys from CDM session with cross-library compatibility.
Args:
cdm: CDM instance
session_id: Session identifier
Returns:
Dictionary mapping KID UUIDs to hex keys
"""
keys = {}
for key in cdm.get_keys(session_id):
if hasattr(key, "key_id"):
kid = key.key_id
elif hasattr(key, "kid"):
kid = key.kid
else:
continue
if hasattr(key, "key") and hasattr(key.key, "hex"):
key_hex = key.key.hex()
elif hasattr(key, "key") and isinstance(key.key, bytes):
key_hex = key.key.hex()
elif hasattr(key, "key") and isinstance(key.key, str):
key_hex = key.key
else:
continue
keys[kid] = key_hex
return keys
def get_content_keys(self, cdm: PlayReadyCdm, certificate: Callable, licence: Callable) -> None:
session_id = cdm.open()
try:
if hasattr(cdm, "set_pssh_b64") and self.pssh_b64:
cdm.set_pssh_b64(self.pssh_b64)
if hasattr(cdm, "set_required_kids"):
cdm.set_required_kids(self.kids)
challenge = cdm.get_license_challenge(session_id, self.pssh.wrm_headers[0])
if challenge:
license_res = licence(challenge=challenge)
if isinstance(license_res, bytes):
license_str = license_res.decode(errors="ignore")
else:
license_str = str(license_res)
if "<License>" not in license_str:
# some servers return the license base64-encoded; "===" pads safely for b64decode
try:
license_str = base64.b64decode(license_str + "===").decode()
except Exception:
pass
cdm.parse_license(session_id, license_str)
keys = self._extract_keys_from_cdm(cdm, session_id)
self.content_keys.update(keys)
finally:
cdm.close(session_id)
if not self.content_keys:
raise PlayReady.Exceptions.EmptyLicense("No Content Keys were within the License")
def decrypt(self, path: Path) -> None:
"""
Decrypt a Track with PlayReady DRM.
Args:
path: Path to the encrypted file to decrypt
Raises:
EnvironmentError if the required decryption executable could not be found.
ValueError if the track has not yet been downloaded.
SubprocessError if the decryption process returned a non-zero exit code.
"""
if not self.content_keys:
raise ValueError("Cannot decrypt a Track without any Content Keys...")
if not path or not path.exists():
raise ValueError("Tried to decrypt a file that does not exist.")
decrypter = str(getattr(config, "decryption", "")).lower()
if decrypter == "mp4decrypt":
return self._decrypt_with_mp4decrypt(path)
else:
return self._decrypt_with_shaka_packager(path)
def _decrypt_with_mp4decrypt(self, path: Path) -> None:
"""Decrypt using mp4decrypt"""
if not binaries.Mp4decrypt:
raise EnvironmentError("mp4decrypt executable not found but is required.")
output_path = path.with_stem(f"{path.stem}_decrypted")
# Build key arguments
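# mp4decrypt takes each key as `--key <kid>:<key>`, both values as
# 32-char hex strings without dashes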
key_args = []
for kid, key in self.content_keys.items():
kid_hex = kid.hex if hasattr(kid, "hex") else str(kid).replace("-", "")
key_hex = key if isinstance(key, str) else key.hex()
key_args.extend(["--key", f"{kid_hex}:{key_hex}"])
cmd = [
str(binaries.Mp4decrypt),
"--show-progress",
*key_args,
str(path),
str(output_path),
]
try:
subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, encoding='utf-8')
except subprocess.CalledProcessError as e:
error_msg = e.stderr if e.stderr else f"mp4decrypt failed with exit code {e.returncode}"
raise subprocess.CalledProcessError(e.returncode, cmd, output=e.stdout, stderr=error_msg)
if not output_path.exists():
raise RuntimeError(f"mp4decrypt failed: output file {output_path} was not created")
if output_path.stat().st_size == 0:
raise RuntimeError(f"mp4decrypt failed: output file {output_path} is empty")
path.unlink()
shutil.move(output_path, path)
def _decrypt_with_shaka_packager(self, path: Path) -> None:
"""Decrypt using Shaka Packager (original method)"""
if not binaries.ShakaPackager:
raise EnvironmentError("Shaka Packager executable not found but is required.")
output_path = path.with_stem(f"{path.stem}_decrypted")
config.directories.temp.mkdir(parents=True, exist_ok=True)
try:
arguments = [
f"input={path},stream=0,output={output_path},output_format=MP4",
"--enable_raw_key_decryption",
"--keys",
",".join(
[
*[
f"label={i}:key_id={kid.hex}:key={key.lower()}"
for i, (kid, key) in enumerate(self.content_keys.items())
],
*[
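# some services use a blank KID on the file, but real KID for license server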
f"label={i}:key_id={'00' * 16}:key={key.lower()}"
for i, (kid, key) in enumerate(self.content_keys.items(), len(self.content_keys))
],
]
),
"--temp_dir",
config.directories.temp,
]
p = subprocess.Popen(
[binaries.ShakaPackager, *arguments],
stdout=subprocess.DEVNULL,
stderr=subprocess.PIPE,
universal_newlines=True,
)
stream_skipped = False
had_error = False
shaka_log_buffer = ""
for line in iter(p.stderr.readline, ""):
line = line.strip()
if not line:
continue
if "Skip stream" in line:
stream_skipped = True
if ":INFO:" in line:
continue
if "I0" in line or "W0" in line:
continue
if ":ERROR:" in line:
had_error = True
if "Insufficient bits in bitstream for given AVC profile" in line:
continue
shaka_log_buffer += f"{line.strip()}\n"
if shaka_log_buffer:
shaka_log_buffer = "\n ".join(
textwrap.wrap(shaka_log_buffer.rstrip(), width=console.width - 22, initial_indent="")
)
console.log(Text.from_ansi("\n[PlayReady]: " + shaka_log_buffer))
p.wait()
if p.returncode != 0 or had_error:
raise subprocess.CalledProcessError(p.returncode, arguments)
path.unlink()
if not stream_skipped:
shutil.move(output_path, path)
except subprocess.CalledProcessError as e:
if e.returncode == 0xC000013A:  # STATUS_CONTROL_C_EXIT
raise KeyboardInterrupt()
raise
class Exceptions:
class PSSHNotFound(Exception):
pass
class KIDNotFound(Exception):
pass
class CEKNotFound(Exception):
pass
class EmptyLicense(Exception):
pass
__all__ = ("PlayReady",)


@ -0,0 +1,398 @@
from __future__ import annotations
import base64
import shutil
import subprocess
import textwrap
from pathlib import Path
from typing import Any, Callable, Optional, Union
from uuid import UUID
import m3u8
from construct import Container
from pymp4.parser import Box
from pywidevine.cdm import Cdm as WidevineCdm
from pywidevine.pssh import PSSH
from requests import Session
from rich.text import Text
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import AnyTrack
from unshackle.core.utilities import get_boxes
from unshackle.core.utils.subprocess import ffprobe
class Widevine:
"""Widevine DRM System."""
def __init__(self, pssh: PSSH, kid: Union[UUID, str, bytes, None] = None, **kwargs: Any):
if not pssh:
raise ValueError("Provided PSSH is empty.")
if not isinstance(pssh, PSSH):
raise TypeError(f"Expected pssh to be a {PSSH}, not {pssh!r}")
if pssh.system_id == PSSH.SystemId.PlayReady:
pssh.to_widevine()
if kid:
if isinstance(kid, str):
kid = UUID(hex=kid)
elif isinstance(kid, bytes):
kid = UUID(bytes=kid)
if not isinstance(kid, UUID):
raise ValueError(f"Expected kid to be a {UUID}, str, or bytes, not {kid!r}")
pssh.set_key_ids([kid])
self._pssh = pssh
if not self.kids:
raise Widevine.Exceptions.KIDNotFound("No Key ID was found within PSSH and none were provided.")
self.content_keys: dict[UUID, str] = {}
self.data: dict = kwargs or {}
@classmethod
def from_track(cls, track: AnyTrack, session: Optional[Session] = None) -> Widevine:
"""
Create a Widevine DRM System object from a track's information.
Gets the PSSH and KID from within the Initialization Segment of the
Track Data. It also tries to get the PSSH and KID from other track
data, like M3U8 data, as well as through ffprobe.
This should only be used if a PSSH could not be provided directly.
It is *rare* to need to use this.
You may provide your own requests session to be able to use custom
headers and more.
Raises:
PSSHNotFound - If the PSSH was not found within the data.
KIDNotFound - If the KID was not found within the data or PSSH.
"""
if not session:
session = Session()
session.headers.update(config.headers)
kid: Optional[UUID] = None
pssh_boxes: list[Container] = []
tenc_boxes: list[Container] = []
if track.descriptor == track.Descriptor.HLS:
m3u_url = track.url
master = m3u8.loads(session.get(m3u_url).text, uri=m3u_url)
pssh_boxes.extend(
Box.parse(base64.b64decode(x.uri.split(",")[-1]))
for x in (master.session_keys or master.keys)
if x and x.keyformat and x.keyformat.lower() == WidevineCdm.urn
)
init_data = track.get_init_segment(session=session)
if init_data:
# try to get it via ffprobe, needed for non-MP4 data e.g. WebM from Google Play
probe = ffprobe(init_data)
if probe:
for stream in probe.get("streams") or []:
enc_key_id = stream.get("tags", {}).get("enc_key_id")
if enc_key_id:
kid = UUID(bytes=base64.b64decode(enc_key_id))
pssh_boxes.extend(list(get_boxes(init_data, b"pssh")))
tenc_boxes.extend(list(get_boxes(init_data, b"tenc")))
pssh_boxes.sort(key=lambda b: {PSSH.SystemId.Widevine: 0, PSSH.SystemId.PlayReady: 1}[b.system_ID])
pssh = next(iter(pssh_boxes), None)
if not pssh:
raise Widevine.Exceptions.PSSHNotFound("PSSH was not found in track data.")
tenc = next(iter(tenc_boxes), None)
if not kid and tenc and tenc.key_ID.int != 0:
kid = tenc.key_ID
return cls(pssh=PSSH(pssh), kid=kid)
@classmethod
def from_init_data(cls, init_data: bytes) -> Widevine:
"""
Get PSSH and KID from within Initialization Segment Data.
This should only be used if a PSSH could not be provided directly.
It is *rare* to need to use this.
Raises:
PSSHNotFound - If the PSSH was not found within the data.
KIDNotFound - If the KID was not found within the data or PSSH.
"""
if not init_data:
raise ValueError("Init data should be provided.")
if not isinstance(init_data, bytes):
raise TypeError(f"Expected init data to be bytes, not {init_data!r}")
kid: Optional[UUID] = None
pssh_boxes: list[Container] = list(get_boxes(init_data, b"pssh"))
tenc_boxes: list[Container] = list(get_boxes(init_data, b"tenc"))
# try to get it via ffprobe, needed for non-MP4 data e.g. WebM from Google Play
probe = ffprobe(init_data)
if probe:
for stream in probe.get("streams") or []:
enc_key_id = stream.get("tags", {}).get("enc_key_id")
if enc_key_id:
kid = UUID(bytes=base64.b64decode(enc_key_id))
pssh_boxes.sort(key=lambda b: {PSSH.SystemId.Widevine: 0, PSSH.SystemId.PlayReady: 1}[b.system_ID])
pssh = next(iter(pssh_boxes), None)
if not pssh:
raise Widevine.Exceptions.PSSHNotFound("PSSH was not found in track data.")
tenc = next(iter(tenc_boxes), None)
if not kid and tenc and tenc.key_ID.int != 0:
kid = tenc.key_ID
return cls(pssh=PSSH(pssh), kid=kid)
@property
def pssh(self) -> PSSH:
"""Get Protection System Specific Header Box."""
return self._pssh
@property
def kid(self) -> Optional[UUID]:
"""Get first Key ID, if any."""
return next(iter(self.kids), None)
@property
def kids(self) -> list[UUID]:
"""Get all Key IDs."""
return self._pssh.key_ids
def get_content_keys(self, cdm: WidevineCdm, certificate: Callable, licence: Callable) -> None:
"""
Create a CDM Session and obtain Content Keys for this DRM Instance.
The certificate and license params are expected to be a function and will
be provided with the challenge and session ID.
"""
for kid in self.kids:
if kid in self.content_keys:
continue
session_id = cdm.open()
try:
cert = certificate(challenge=cdm.service_certificate_challenge)
if cert and hasattr(cdm, "set_service_certificate"):
cdm.set_service_certificate(session_id, cert)
if hasattr(cdm, "set_required_kids"):
cdm.set_required_kids(self.kids)
challenge = cdm.get_license_challenge(session_id, self.pssh)
if hasattr(cdm, "has_cached_keys") and cdm.has_cached_keys(session_id):
pass
else:
cdm.parse_license(session_id, licence(challenge=challenge))
self.content_keys = {key.kid: key.key.hex() for key in cdm.get_keys(session_id, "CONTENT")}
if not self.content_keys:
raise Widevine.Exceptions.EmptyLicense("No Content Keys were within the License")
if kid not in self.content_keys:
raise Widevine.Exceptions.CEKNotFound(f"No Content Key for KID {kid.hex} within the License")
finally:
cdm.close(session_id)
def get_NF_content_keys(self, cdm: WidevineCdm, certificate: Callable, licence: Callable) -> None:
"""
Create a CDM Session and obtain Content Keys for this DRM Instance.
The certificate and license params are expected to be a function. Unlike
get_content_keys, the licence callable is also passed the CDM session ID
alongside the challenge, as some services require it.
"""
for kid in self.kids:
if kid in self.content_keys:
continue
session_id = cdm.open()
try:
cert = certificate(challenge=cdm.service_certificate_challenge)
if cert and hasattr(cdm, "set_service_certificate"):
cdm.set_service_certificate(session_id, cert)
if hasattr(cdm, "set_required_kids"):
cdm.set_required_kids(self.kids)
challenge = cdm.get_license_challenge(session_id, self.pssh)
if hasattr(cdm, "has_cached_keys") and cdm.has_cached_keys(session_id):
pass
else:
cdm.parse_license(
session_id,
licence(session_id=session_id, challenge=challenge),
)
self.content_keys = {key.kid: key.key.hex() for key in cdm.get_keys(session_id, "CONTENT")}
if not self.content_keys:
raise Widevine.Exceptions.EmptyLicense("No Content Keys were within the License")
if kid not in self.content_keys:
raise Widevine.Exceptions.CEKNotFound(f"No Content Key for KID {kid.hex} within the License")
finally:
cdm.close(session_id)
def decrypt(self, path: Path) -> None:
"""
Decrypt a Track with Widevine DRM.
Args:
path: Path to the encrypted file to decrypt
Raises:
EnvironmentError if the required decryption executable could not be found.
ValueError if the track has not yet been downloaded.
SubprocessError if the decryption process returned a non-zero exit code.
"""
if not self.content_keys:
raise ValueError("Cannot decrypt a Track without any Content Keys...")
if not path or not path.exists():
raise ValueError("Tried to decrypt a file that does not exist.")
decrypter = str(getattr(config, "decryption", "")).lower()
if decrypter == "mp4decrypt":
return self._decrypt_with_mp4decrypt(path)
else:
return self._decrypt_with_shaka_packager(path)
def _decrypt_with_mp4decrypt(self, path: Path) -> None:
"""Decrypt using mp4decrypt"""
if not binaries.Mp4decrypt:
raise EnvironmentError("mp4decrypt executable not found but is required.")
output_path = path.with_stem(f"{path.stem}_decrypted")
# Build key arguments
key_args = []
for kid, key in self.content_keys.items():
kid_hex = kid.hex if hasattr(kid, "hex") else str(kid).replace("-", "")
key_hex = key if isinstance(key, str) else key.hex()
key_args.extend(["--key", f"{kid_hex}:{key_hex}"])
cmd = [
str(binaries.Mp4decrypt),
"--show-progress",
*key_args,
str(path),
str(output_path),
]
try:
subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, encoding='utf-8')
except subprocess.CalledProcessError as e:
error_msg = e.stderr if e.stderr else f"mp4decrypt failed with exit code {e.returncode}"
raise subprocess.CalledProcessError(e.returncode, cmd, output=e.stdout, stderr=error_msg)
if not output_path.exists():
raise RuntimeError(f"mp4decrypt failed: output file {output_path} was not created")
if output_path.stat().st_size == 0:
raise RuntimeError(f"mp4decrypt failed: output file {output_path} is empty")
path.unlink()
shutil.move(output_path, path)
def _decrypt_with_shaka_packager(self, path: Path) -> None:
"""Decrypt using Shaka Packager (original method)"""
if not binaries.ShakaPackager:
raise EnvironmentError("Shaka Packager executable not found but is required.")
output_path = path.with_stem(f"{path.stem}_decrypted")
config.directories.temp.mkdir(parents=True, exist_ok=True)
try:
arguments = [
f"input={path},stream=0,output={output_path},output_format=MP4",
"--enable_raw_key_decryption",
"--keys",
",".join(
[
*[
"label={}:key_id={}:key={}".format(i, kid.hex, key.lower())
for i, (kid, key) in enumerate(self.content_keys.items())
],
*[
# some services use a blank KID on the file, but real KID for license server
"label={}:key_id={}:key={}".format(i, "00" * 16, key.lower())
for i, (kid, key) in enumerate(self.content_keys.items(), len(self.content_keys))
],
]
),
"--temp_dir",
config.directories.temp,
]
p = subprocess.Popen(
[binaries.ShakaPackager, *arguments],
stdout=subprocess.DEVNULL,
stderr=subprocess.PIPE,
universal_newlines=True,
)
stream_skipped = False
had_error = False
shaka_log_buffer = ""
for line in iter(p.stderr.readline, ""):
line = line.strip()
if not line:
continue
if "Skip stream" in line:
# file/segment was so small that it didn't have any actual data, ignore
stream_skipped = True
if ":INFO:" in line:
continue
if "I0" in line or "W0" in line:
continue
if ":ERROR:" in line:
had_error = True
if "Insufficient bits in bitstream for given AVC profile" in line:
# this is a warning and is something we don't have to worry about
continue
shaka_log_buffer += f"{line.strip()}\n"
if shaka_log_buffer:
# wrap to console width - padding - '[Widevine]: '
shaka_log_buffer = "\n ".join(
textwrap.wrap(shaka_log_buffer.rstrip(), width=console.width - 22, initial_indent="")
)
console.log(Text.from_ansi("\n[Widevine]: " + shaka_log_buffer))
p.wait()
if p.returncode != 0 or had_error:
raise subprocess.CalledProcessError(p.returncode, arguments)
path.unlink()
if not stream_skipped:
shutil.move(output_path, path)
except subprocess.CalledProcessError as e:
if e.returncode == 0xC000013A: # STATUS_CONTROL_C_EXIT
raise KeyboardInterrupt()
raise
class Exceptions:
class PSSHNotFound(Exception):
"""PSSH (Protection System Specific Header) was not found."""
class KIDNotFound(Exception):
"""KID (Encryption Key ID) was not found."""
class CEKNotFound(Exception):
"""CEK (Content Encryption Key) for KID was not found in License."""
class EmptyLicense(Exception):
"""License returned no Content Encryption Keys."""
__all__ = ("Widevine",)

76
unshackle/core/events.py Normal file

@ -0,0 +1,76 @@
from __future__ import annotations
from copy import deepcopy
from enum import Enum
from typing import Any, Callable
class Events:
class Types(Enum):
_reserved = 0
# A Track's segment has finished downloading
SEGMENT_DOWNLOADED = 1
# Track has finished downloading
TRACK_DOWNLOADED = 2
# Track has finished decrypting
TRACK_DECRYPTED = 3
# Track has finished repacking
TRACK_REPACKED = 4
# Track is about to be Multiplexed into a Container
TRACK_MULTIPLEX = 5
def __init__(self):
self.__subscriptions: dict[Events.Types, list[Callable]] = {}
self.__ephemeral: dict[Events.Types, list[Callable]] = {}
self.reset()
def reset(self):
"""Reset Event Observer clearing all Subscriptions."""
self.__subscriptions = {k: [] for k in Events.Types.__members__.values()}
self.__ephemeral = deepcopy(self.__subscriptions)
def subscribe(self, event_type: Events.Types, callback: Callable, ephemeral: bool = False) -> None:
"""
Subscribe to an Event with a Callback.
Parameters:
event_type: The Events.Type to subscribe to.
callback: The function or lambda to call on event emit.
ephemeral: Unsubscribe the callback from the event on first emit.
Note that this is not thread-safe and may be called multiple
times at roughly the same time.
"""
[self.__subscriptions, self.__ephemeral][ephemeral][event_type].append(callback)
def unsubscribe(self, event_type: Events.Types, callback: Callable) -> None:
"""
Unsubscribe a Callback from an Event.
Parameters:
event_type: The Events.Type to unsubscribe from.
callback: The function or lambda to remove from event emit.
"""
if callback in self.__subscriptions[event_type]:
self.__subscriptions[event_type].remove(callback)
if callback in self.__ephemeral[event_type]:
self.__ephemeral[event_type].remove(callback)
def emit(self, event_type: Events.Types, *args: Any, **kwargs: Any) -> None:
"""
Emit an Event, executing all subscribed Callbacks.
Parameters:
event_type: The Events.Type to emit.
args: Positional arguments to pass to callbacks.
kwargs: Keyword arguments to pass to callbacks.
"""
if event_type not in self.__subscriptions:
raise ValueError(f'Event type "{event_type}" is invalid')
for callback in self.__subscriptions[event_type] + self.__ephemeral[event_type]:
callback(*args, **kwargs)
self.__ephemeral[event_type].clear()
events = Events()
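# Usage sketch (illustrative only, not part of this module): `events` is a
# module-level singleton, so callbacks registered anywhere fire on emit.
#
#     def on_downloaded(track):
#         print(f"finished: {track}")
#
#     events.subscribe(events.Types.TRACK_DOWNLOADED, on_downloaded)
#     events.emit(events.Types.TRACK_DOWNLOADED, track="example")  # prints "finished: example"
#     # ephemeral=True unsubscribes the callback after its first emit
#     events.subscribe(events.Types.TRACK_DOWNLOADED, on_downloaded, ephemeral=True)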


@ -0,0 +1,5 @@
from .dash import DASH
from .hls import HLS
from .ism import ISM
__all__ = ("DASH", "HLS", "ISM")


@ -0,0 +1,807 @@
from __future__ import annotations
import base64
import html
import logging
import math
import re
import sys
from copy import copy
from functools import partial
from pathlib import Path
from typing import Any, Callable, Optional, Union
from urllib.parse import urljoin, urlparse
from uuid import UUID
from zlib import crc32
import requests
from curl_cffi.requests import Session as CurlSession
from langcodes import Language, tag_is_valid
from lxml.etree import Element, ElementTree
from pyplayready.system.pssh import PSSH as PR_PSSH
from pywidevine.cdm import Cdm as WidevineCdm
from pywidevine.pssh import PSSH
from requests import Session
from unshackle.core.constants import DOWNLOAD_CANCELLED, DOWNLOAD_LICENCE_ONLY, AnyTrack
from unshackle.core.downloaders import requests as requests_downloader
from unshackle.core.drm import DRM_T, PlayReady, Widevine
from unshackle.core.events import events
from unshackle.core.tracks import Audio, Subtitle, Tracks, Video
from unshackle.core.utilities import is_close_match, try_ensure_utf8
from unshackle.core.utils.xml import load_xml
class DASH:
def __init__(self, manifest, url: str):
if manifest is None:
raise ValueError("DASH manifest must be provided.")
if manifest.tag != "MPD":
raise TypeError(f"Expected 'MPD' document, but received a '{manifest.tag}' document instead.")
if not url:
raise requests.URLRequired("DASH manifest URL must be provided for relative path computations.")
if not isinstance(url, str):
raise TypeError(f"Expected url to be a {str}, not {url!r}")
self.manifest = manifest
self.url = url
@classmethod
def from_url(cls, url: str, session: Optional[Union[Session, CurlSession]] = None, **args: Any) -> DASH:
if not url:
raise requests.URLRequired("DASH manifest URL must be provided for relative path computations.")
if not isinstance(url, str):
raise TypeError(f"Expected url to be a {str}, not {url!r}")
if not session:
session = Session()
elif not isinstance(session, (Session, CurlSession)):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not {session!r}")
res = session.get(url, **args)
if res.url != url:
url = res.url
if not res.ok:
raise requests.ConnectionError("Failed to request the MPD document.", response=res)
return DASH.from_text(res.text, url)
@classmethod
def from_text(cls, text: str, url: str) -> DASH:
if not text:
raise ValueError("DASH manifest Text must be provided.")
if not isinstance(text, str):
raise TypeError(f"Expected text to be a {str}, not {text!r}")
if not url:
raise requests.URLRequired("DASH manifest URL must be provided for relative path computations.")
if not isinstance(url, str):
raise TypeError(f"Expected url to be a {str}, not {url!r}")
manifest = load_xml(text)
return cls(manifest, url)
def to_tracks(
self, language: Optional[Union[str, Language]] = None, period_filter: Optional[Callable] = None
) -> Tracks:
"""
Convert an MPEG-DASH document to Video, Audio and Subtitle Track objects.
Parameters:
language: The Title's Original Recorded Language. It will also be used as a fallback
track language value if the manifest does not list language information.
period_filter: Filter out Periods within the manifest.
All Track URLs will be a list of segment URLs.
"""
tracks = Tracks()
for period in self.manifest.findall("Period"):
if callable(period_filter) and period_filter(period):
continue
if next(iter(period.xpath("SegmentType/@value")), "content") != "content":
continue
if "urn:amazon:primevideo:cachingBreadth" in [
x.get("schemeIdUri") for x in period.findall("SupplementalProperty")
]:
continue
for adaptation_set in period.findall("AdaptationSet"):
if self.is_trick_mode(adaptation_set):
# we don't want trick mode streams (they are only used for fast-forward/rewind)
continue
for rep in adaptation_set.findall("Representation"):
get = partial(self._get, adaptation_set=adaptation_set, representation=rep)
findall = partial(self._findall, adaptation_set=adaptation_set, representation=rep, both=True)
segment_base = rep.find("SegmentBase")
codecs = get("codecs")
content_type = get("contentType")
mime_type = get("mimeType")
if not content_type and mime_type:
content_type = mime_type.split("/")[0]
if not content_type and not mime_type:
raise ValueError("Unable to determine the format of a Representation, cannot continue...")
if mime_type == "application/mp4" or content_type == "application":
# likely mp4-boxed subtitles
# TODO: It may not actually be subtitles
try:
real_codec = Subtitle.Codec.from_mime(codecs)
content_type = "text"
mime_type = f"application/mp4; codecs='{real_codec.value.lower()}'"
except ValueError:
raise ValueError(f"Unsupported content type '{content_type}' with codecs of '{codecs}'")
if content_type == "text" and mime_type and "/mp4" not in mime_type:
# mimeType likely specifies the subtitle codec better than `codecs`
codecs = mime_type.split("/")[1]
if content_type == "video":
track_type = Video
track_codec = Video.Codec.from_codecs(codecs)
track_fps = get("frameRate")
if not track_fps and segment_base is not None:
track_fps = segment_base.get("timescale")
track_args = dict(
range_=self.get_video_range(
codecs, findall("SupplementalProperty"), findall("EssentialProperty")
),
bitrate=get("bandwidth") or None,
width=get("width") or 0,
height=get("height") or 0,
fps=track_fps or None,
)
elif content_type == "audio":
track_type = Audio
track_codec = Audio.Codec.from_codecs(codecs)
track_args = dict(
bitrate=get("bandwidth") or None,
channels=next(
iter(
rep.xpath("AudioChannelConfiguration/@value")
or adaptation_set.xpath("AudioChannelConfiguration/@value")
),
None,
),
joc=self.get_ddp_complexity_index(adaptation_set, rep),
descriptive=self.is_descriptive(adaptation_set),
)
elif content_type == "text":
track_type = Subtitle
track_codec = Subtitle.Codec.from_codecs(codecs or "vtt")
track_args = dict(
cc=self.is_closed_caption(adaptation_set),
sdh=self.is_sdh(adaptation_set),
forced=self.is_forced(adaptation_set),
)
elif content_type == "image":
# we don't want what's likely thumbnails for the seekbar
continue
else:
raise ValueError(f"Unknown Track Type '{content_type}'")
track_lang = self.get_language(adaptation_set, rep, fallback=language)
if not track_lang:
msg = "Language information could not be derived from a Representation."
if language is None:
msg += " No fallback language was provided when calling DASH.to_tracks()."
elif not tag_is_valid((str(language) or "").strip()) or str(language).startswith("und"):
msg += f" The fallback language provided is also invalid: {language}"
raise ValueError(msg)
# for some reason it's incredibly common for services to not provide
# a good and actually unique track ID, sometimes because of the lang
# dialect not being represented in the id, or the bitrate, or such.
# this combines all of them as one and hashes it to keep it small(ish).
track_id = hex(
crc32(
"{codec}-{lang}-{bitrate}-{base_url}-{ids}-{track_args}".format(
codec=codecs,
lang=track_lang,
bitrate=get("bitrate"),
base_url=(rep.findtext("BaseURL") or "").split("?")[0],
ids=[get("audioTrackId"), get("id"), period.get("id")],
track_args=track_args,
).encode()
)
)[2:]
tracks.add(
track_type(
id_=track_id,
url=self.url,
codec=track_codec,
language=track_lang,
is_original_lang=bool(language and is_close_match(track_lang, [language])),
descriptor=Video.Descriptor.DASH,
data={
"dash": {
"manifest": self.manifest,
"period": period,
"adaptation_set": adaptation_set,
"representation": rep,
}
},
**track_args,
)
)
# only get tracks from the first main-content period
break
return tracks
@staticmethod
def download_track(
track: AnyTrack,
save_path: Path,
save_dir: Path,
progress: partial,
session: Optional[Session] = None,
proxy: Optional[str] = None,
max_workers: Optional[int] = None,
license_widevine: Optional[Callable] = None,
*,
cdm: Optional[object] = None,
):
if not session:
session = Session()
elif not isinstance(session, (Session, CurlSession)):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not {session!r}")
if proxy:
session.proxies.update({"all": proxy})
log = logging.getLogger("DASH")
manifest: ElementTree = track.data["dash"]["manifest"]
period: Element = track.data["dash"]["period"]
adaptation_set: Element = track.data["dash"]["adaptation_set"]
representation: Element = track.data["dash"]["representation"]
# Preserve existing DRM if it was set by the service, especially when service set Widevine
# but manifest only contains PlayReady protection (common scenario for some services)
existing_drm = track.drm
manifest_drm = DASH.get_drm(
representation.findall("ContentProtection") + adaptation_set.findall("ContentProtection")
)
# Only override existing DRM if:
# 1. No existing DRM was set, OR
# 2. Existing DRM contains same type as manifest DRM, OR
# 3. Existing DRM is not Widevine (preserve Widevine when service explicitly set it)
should_override_drm = (
not existing_drm
or (
existing_drm
and manifest_drm
and any(isinstance(existing, type(m_drm)) for existing in existing_drm for m_drm in manifest_drm)
)
or (existing_drm and not any(isinstance(drm, Widevine) for drm in existing_drm))
)
if should_override_drm:
track.drm = manifest_drm
else:
track.drm = existing_drm
manifest_base_url = manifest.findtext("BaseURL")
if not manifest_base_url:
manifest_base_url = track.url
elif not re.match("^https?://", manifest_base_url, re.IGNORECASE):
manifest_base_url = urljoin(track.url, f"./{manifest_base_url}")
period_base_url = urljoin(manifest_base_url, period.findtext("BaseURL") or "")
adaptation_set_base_url = urljoin(period_base_url, adaptation_set.findtext("BaseURL") or "")
rep_base_url = urljoin(adaptation_set_base_url, representation.findtext("BaseURL") or "")
period_duration = period.get("duration") or manifest.get("mediaPresentationDuration")
init_data: Optional[bytes] = None
segment_template = representation.find("SegmentTemplate")
if segment_template is None:
segment_template = adaptation_set.find("SegmentTemplate")
segment_list = representation.find("SegmentList")
if segment_list is None:
segment_list = adaptation_set.find("SegmentList")
segment_base = representation.find("SegmentBase")
if segment_base is None:
segment_base = adaptation_set.find("SegmentBase")
segments: list[tuple[str, Optional[str]]] = []
segment_timescale: float = 0
segment_durations: list[int] = []
track_kid: Optional[UUID] = None
if segment_template is not None:
segment_template = copy(segment_template)
start_number = int(segment_template.get("startNumber") or 1)
end_number = int(segment_template.get("endNumber") or 0) or None
segment_timeline = segment_template.find("SegmentTimeline")
segment_timescale = float(segment_template.get("timescale") or 1)
for item in ("initialization", "media"):
value = segment_template.get(item)
if not value:
continue
if not re.match("^https?://", value, re.IGNORECASE):
if not rep_base_url:
raise ValueError("Resolved Segment URL is not absolute, and no Base URL is available.")
value = urljoin(rep_base_url, value)
if not urlparse(value).query:
manifest_url_query = urlparse(track.url).query
if manifest_url_query:
value += f"?{manifest_url_query}"
segment_template.set(item, value)
init_url = segment_template.get("initialization")
if init_url:
res = session.get(
DASH.replace_fields(
init_url, Bandwidth=representation.get("bandwidth"), RepresentationID=representation.get("id")
)
)
res.raise_for_status()
init_data = res.content
track_kid = track.get_key_id(init_data)
if segment_timeline is not None:
current_time = 0
for s in segment_timeline.findall("S"):
if s.get("t"):
current_time = int(s.get("t"))
for _ in range(1 + (int(s.get("r") or 0))):
segment_durations.append(current_time)
current_time += int(s.get("d"))
if not end_number:
end_number = len(segment_durations)
for t, n in zip(segment_durations, range(start_number, end_number + 1)):
segments.append(
(
DASH.replace_fields(
segment_template.get("media"),
Bandwidth=representation.get("bandwidth"),
Number=n,
RepresentationID=representation.get("id"),
Time=t,
),
None,
)
)
else:
if not period_duration:
raise ValueError("Duration of the Period was unable to be determined.")
period_duration = DASH.pt_to_sec(period_duration)
segment_duration = float(segment_template.get("duration") or 1)
if not end_number:
segment_count = math.ceil(period_duration / (segment_duration / segment_timescale))
end_number = start_number + segment_count - 1
for s in range(start_number, end_number + 1):
segments.append(
(
DASH.replace_fields(
segment_template.get("media"),
Bandwidth=representation.get("bandwidth"),
Number=s,
RepresentationID=representation.get("id"),
Time=s,
),
None,
)
)
# TODO: Should we floor/ceil/round, or is int() ok?
segment_durations.append(int(segment_duration))
elif segment_list is not None:
segment_timescale = float(segment_list.get("timescale") or 1)
init_data = None
initialization = segment_list.find("Initialization")
if initialization is not None:
source_url = initialization.get("sourceURL")
if not source_url:
source_url = rep_base_url
elif not re.match("^https?://", source_url, re.IGNORECASE):
source_url = urljoin(rep_base_url, f"./{source_url}")
if initialization.get("range"):
init_range_header = {"Range": f"bytes={initialization.get('range')}"}
else:
init_range_header = None
res = session.get(url=source_url, headers=init_range_header)
res.raise_for_status()
init_data = res.content
track_kid = track.get_key_id(init_data)
segment_urls = segment_list.findall("SegmentURL")
for segment_url in segment_urls:
media_url = segment_url.get("media")
if not media_url:
media_url = rep_base_url
elif not re.match("^https?://", media_url, re.IGNORECASE):
media_url = urljoin(rep_base_url, f"./{media_url}")
segments.append((media_url, segment_url.get("mediaRange")))
segment_durations.append(int(segment_url.get("duration") or 1))
elif segment_base is not None:
media_range = None
init_data = None
initialization = segment_base.find("Initialization")
if initialization is not None:
if initialization.get("range"):
init_range_header = {"Range": f"bytes={initialization.get('range')}"}
else:
init_range_header = None
res = session.get(url=rep_base_url, headers=init_range_header)
res.raise_for_status()
init_data = res.content
track_kid = track.get_key_id(init_data)
total_size = res.headers.get("Content-Range", "").split("/")[-1]
if total_size:
media_range = f"{len(init_data)}-{total_size}"
segments.append((rep_base_url, media_range))
elif rep_base_url:
segments.append((rep_base_url, None))
else:
log.error("Could not find a way to get segments from this MPD manifest.")
log.debug(track.url)
sys.exit(1)
# TODO: Should we floor/ceil/round, or is int() ok?
track.data["dash"]["timescale"] = int(segment_timescale)
track.data["dash"]["segment_durations"] = segment_durations
if not track.drm and isinstance(track, (Video, Audio)):
try:
track.drm = [Widevine.from_init_data(init_data)]
except Widevine.Exceptions.PSSHNotFound:
# it might not have Widevine DRM, or might not have found the PSSH
log.warning("No Widevine PSSH was found for this track, is it DRM free?")
if track.drm:
track_kid = track_kid or track.get_key_id(url=segments[0][0], session=session)
drm = track.get_drm_for_cdm(cdm)
if isinstance(drm, (Widevine, PlayReady)):
# license and grab content keys
try:
if not license_widevine:
raise ValueError("license_widevine func must be supplied to use DRM")
progress(downloaded="LICENSING")
license_widevine(drm, track_kid=track_kid)
progress(downloaded="[yellow]LICENSED")
except Exception: # noqa
DOWNLOAD_CANCELLED.set() # skip pending track downloads
progress(downloaded="[red]FAILED")
raise
else:
drm = None
if DOWNLOAD_LICENCE_ONLY.is_set():
progress(downloaded="[yellow]SKIPPED")
return
progress(total=len(segments))
downloader = track.downloader
if downloader.__name__ == "aria2c" and any(bytes_range is not None for url, bytes_range in segments):
# aria2(c) is shit and doesn't support the Range header, fallback to the requests downloader
downloader = requests_downloader
log.warning("Falling back to the requests downloader as aria2(c) doesn't support the Range header")
downloader_args = dict(
urls=[
{"url": url, "headers": {"Range": f"bytes={bytes_range}"} if bytes_range else {}}
for url, bytes_range in segments
],
output_dir=save_dir,
filename="{i:0%d}.mp4" % (len(str(len(segments)))),
headers=session.headers,
cookies=session.cookies,
proxy=proxy,
max_workers=max_workers,
)
if downloader.__name__ == "n_m3u8dl_re":
downloader_args.update({"filename": track.id, "track": track})
for status_update in downloader(**downloader_args):
file_downloaded = status_update.get("file_downloaded")
if file_downloaded:
events.emit(events.Types.SEGMENT_DOWNLOADED, track=track, segment=file_downloaded)
else:
downloaded = status_update.get("downloaded")
if downloaded and downloaded.endswith("/s"):
status_update["downloaded"] = f"DASH {downloaded}"
progress(**status_update)
# see https://github.com/devine-dl/devine/issues/71
for control_file in save_dir.glob("*.aria2__temp"):
control_file.unlink()
segments_to_merge = [x for x in sorted(save_dir.iterdir()) if x.is_file()]
with open(save_path, "wb") as f:
if init_data:
f.write(init_data)
if len(segments_to_merge) > 1:
progress(downloaded="Merging", completed=0, total=len(segments_to_merge))
for segment_file in segments_to_merge:
segment_data = segment_file.read_bytes()
# TODO: fix encoding after decryption?
if (
not drm
and isinstance(track, Subtitle)
and track.codec not in (Subtitle.Codec.fVTT, Subtitle.Codec.fTTML)
):
segment_data = try_ensure_utf8(segment_data)
segment_data = (
segment_data.decode("utf8")
.replace("&lrm;", html.unescape("&lrm;"))
.replace("&rlm;", html.unescape("&rlm;"))
.encode("utf8")
)
f.write(segment_data)
f.flush()
segment_file.unlink()
progress(advance=1)
track.path = save_path
events.emit(events.Types.TRACK_DOWNLOADED, track=track)
if drm:
progress(downloaded="Decrypting", completed=0, total=100)
drm.decrypt(save_path)
track.drm = None
events.emit(events.Types.TRACK_DECRYPTED, track=track, drm=drm, segment=None)
progress(downloaded="Decrypting", advance=100)
save_dir.rmdir()
progress(downloaded="Downloaded")
@staticmethod
def _get(item: str, adaptation_set: Element, representation: Optional[Element] = None) -> Optional[Any]:
"""Helper to get a requested item from the Representation, otherwise from the AdaptationSet."""
adaptation_set_item = adaptation_set.get(item)
if representation is None:
return adaptation_set_item
representation_item = representation.get(item)
if representation_item is not None:
return representation_item
return adaptation_set_item
@staticmethod
def _findall(
item: str, adaptation_set: Element, representation: Optional[Element] = None, both: bool = False
) -> list[Any]:
"""
Helper to get all requested items from the Representation, otherwise from the AdaptationSet.
Optionally, you may pass both=True to keep both values (where available).
"""
adaptation_set_items = adaptation_set.findall(item)
if representation is None:
return adaptation_set_items
representation_items = representation.findall(item)
if both:
return representation_items + adaptation_set_items
if representation_items:
return representation_items
return adaptation_set_items
@staticmethod
def get_language(
adaptation_set: Element,
representation: Optional[Element] = None,
fallback: Optional[Union[str, Language]] = None,
) -> Optional[Language]:
"""
Get Language (if any) from the AdaptationSet or Representation.
A fallback language may be provided if no language information could be
retrieved.
"""
options = []
if representation is not None:
options.append(representation.get("lang"))
# derive language from somewhat common id string format
# the format is typically "{rep_id}_{lang}={bitrate}" or similar
rep_id = representation.get("id")
if rep_id:
m = re.match(r"\w+_(\w+)=\d+", rep_id)
if m:
options.append(m.group(1))
options.append(adaptation_set.get("lang"))
if fallback:
options.append(fallback)
for option in options:
option = (str(option) or "").strip()
if not tag_is_valid(option) or option.startswith("und"):
continue
return Language.get(option)
@staticmethod
def get_video_range(
codecs: str, all_supplemental_props: list[Element], all_essential_props: list[Element]
) -> Video.Range:
if codecs.startswith(("dva1", "dvav", "dvhe", "dvh1")):
return Video.Range.DV
return Video.Range.from_cicp(
primaries=next(
(
int(x.get("value"))
for x in all_supplemental_props + all_essential_props
if x.get("schemeIdUri") == "urn:mpeg:mpegB:cicp:ColourPrimaries"
),
0,
),
transfer=next(
(
int(x.get("value"))
for x in all_supplemental_props + all_essential_props
if x.get("schemeIdUri") == "urn:mpeg:mpegB:cicp:TransferCharacteristics"
),
0,
),
matrix=next(
(
int(x.get("value"))
for x in all_supplemental_props + all_essential_props
if x.get("schemeIdUri") == "urn:mpeg:mpegB:cicp:MatrixCoefficients"
),
0,
),
)
@staticmethod
def is_trick_mode(adaptation_set: Element) -> bool:
"""Check if contents of Adaptation Set is a Trick-Mode stream."""
essential_props = adaptation_set.findall("EssentialProperty")
supplemental_props = adaptation_set.findall("SupplementalProperty")
return any(
prop.get("schemeIdUri") == "http://dashif.org/guidelines/trickmode"
for prop in essential_props + supplemental_props
)
@staticmethod
def is_descriptive(adaptation_set: Element) -> bool:
"""Check if contents of Adaptation Set is Descriptive."""
return any(
(x.get("schemeIdUri"), x.get("value"))
in (("urn:mpeg:dash:role:2011", "descriptive"), ("urn:tva:metadata:cs:AudioPurposeCS:2007", "1"))
for x in adaptation_set.findall("Accessibility")
)
@staticmethod
def is_forced(adaptation_set: Element) -> bool:
"""Check if contents of Adaptation Set is a Forced Subtitle."""
return any(
x.get("schemeIdUri") == "urn:mpeg:dash:role:2011"
and x.get("value") in ("forced-subtitle", "forced_subtitle")
for x in adaptation_set.findall("Role")
)
@staticmethod
def is_sdh(adaptation_set: Element) -> bool:
"""Check if contents of Adaptation Set is for the Hearing Impaired."""
return any(
(x.get("schemeIdUri"), x.get("value")) == ("urn:tva:metadata:cs:AudioPurposeCS:2007", "2")
for x in adaptation_set.findall("Accessibility")
)
@staticmethod
def is_closed_caption(adaptation_set: Element) -> bool:
"""Check if contents of Adaptation Set is a Closed Caption Subtitle."""
return any(
(x.get("schemeIdUri"), x.get("value")) == ("urn:mpeg:dash:role:2011", "caption")
for x in adaptation_set.findall("Role")
)
@staticmethod
def get_ddp_complexity_index(adaptation_set: Element, representation: Optional[Element]) -> Optional[int]:
"""Get the DD+ Complexity Index (if any) from the AdaptationSet or Representation."""
return next(
(
int(x.get("value"))
for x in DASH._findall("SupplementalProperty", adaptation_set, representation, both=True)
if x.get("schemeIdUri") == "tag:dolby.com,2018:dash:EC3_ExtensionComplexityIndex:2018"
),
None,
)
@staticmethod
def get_drm(protections: list[Element]) -> list[DRM_T]:
drm: list[DRM_T] = []
for protection in protections:
urn = (protection.get("schemeIdUri") or "").lower()
if urn == WidevineCdm.urn:
pssh_text = protection.findtext("pssh")
if not pssh_text:
continue
pssh = PSSH(pssh_text)
kid = protection.get("kid")
if kid:
kid = UUID(bytes=base64.b64decode(kid))
default_kid = protection.get("default_KID")
if default_kid:
kid = UUID(default_kid)
if not pssh.key_ids and not kid:
kid = next((UUID(p.get("default_KID")) for p in protections if p.get("default_KID")), None)
drm.append(Widevine(pssh=pssh, kid=kid))
elif urn in ("urn:uuid:9a04f079-9840-4286-ab92-e65be0885f95", "urn:microsoft:playready"):
pr_pssh_b64 = (
protection.findtext("pssh")
or protection.findtext("pro")
or protection.findtext("{urn:microsoft:playready}pro")
)
if not pr_pssh_b64:
continue
pr_pssh = PR_PSSH(pr_pssh_b64)
kid_b64 = protection.findtext("kid")
kid = None
if kid_b64:
try:
kid = UUID(bytes=base64.b64decode(kid_b64))
except Exception:
kid = None
drm.append(PlayReady(pssh=pr_pssh, kid=kid, pssh_b64=pr_pssh_b64))
return drm
@staticmethod
def pt_to_sec(d: Union[str, float]) -> float:
if isinstance(d, float):
return d
has_ymd = d[0:8] == "P0Y0M0DT"
if d[0:2] != "PT" and not has_ymd:
raise ValueError("Input data is not a valid time string.")
if has_ymd:
d = d[6:].upper() # skip `P0Y0M0DT`
else:
d = d[2:].upper() # skip `PT`
m = re.findall(r"([\d.]+.)", d)
return sum(float(x[0:-1]) * {"H": 60 * 60, "M": 60, "S": 1}[x[-1].upper()] for x in m)
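# Worked example (illustrative): pt_to_sec("PT1H2M3.5S")
# -> 1*3600 + 2*60 + 3.5 == 3723.5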
@staticmethod
def replace_fields(url: str, **kwargs: Any) -> str:
for field, value in kwargs.items():
url = url.replace(f"${field}$", str(value))
m = re.search(rf"\${re.escape(field)}%([a-z0-9]+)\$", url, flags=re.I)
if m:
url = url.replace(m.group(), f"{value:{m.group(1)}}")
return url
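# Worked example (illustrative): replace_fields("seg_$Number%05d$.m4s", Number=3)
# fills the printf-style template and returns "seg_00003.m4s"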
__all__ = ("DASH",)


@ -0,0 +1,882 @@
from __future__ import annotations
import base64
import html
import json
import logging
import os
import shutil
import subprocess
import sys
from functools import partial
from pathlib import Path
from typing import Any, Callable, Optional, Union
from urllib.parse import urljoin
from zlib import crc32
import m3u8
import requests
from curl_cffi.requests import Response as CurlResponse
from curl_cffi.requests import Session as CurlSession
from langcodes import Language, tag_is_valid
from m3u8 import M3U8
from pyplayready.cdm import Cdm as PlayReadyCdm
from pyplayready.system.pssh import PSSH as PR_PSSH
from pywidevine.cdm import Cdm as WidevineCdm
from pywidevine.pssh import PSSH as WV_PSSH
from requests import Session
from unshackle.core import binaries
from unshackle.core.constants import DOWNLOAD_CANCELLED, DOWNLOAD_LICENCE_ONLY, AnyTrack
from unshackle.core.downloaders import requests as requests_downloader
from unshackle.core.drm import DRM_T, ClearKey, PlayReady, Widevine
from unshackle.core.events import events
from unshackle.core.tracks import Audio, Subtitle, Tracks, Video
from unshackle.core.utilities import get_extension, is_close_match, try_ensure_utf8
class HLS:
def __init__(self, manifest: M3U8, session: Optional[Union[Session, CurlSession]] = None):
if not manifest:
raise ValueError("HLS manifest must be provided.")
if not isinstance(manifest, M3U8):
raise TypeError(f"Expected manifest to be a {M3U8}, not {manifest!r}")
if not manifest.is_variant:
raise ValueError("Expected the M3U(8) manifest to be a Variant Playlist.")
self.manifest = manifest
self.session = session or Session()
@classmethod
def from_url(cls, url: str, session: Optional[Union[Session, CurlSession]] = None, **args: Any) -> HLS:
if not url:
raise requests.URLRequired("HLS manifest URL must be provided.")
if not isinstance(url, str):
raise TypeError(f"Expected url to be a {str}, not {url!r}")
if not session:
session = Session()
elif not isinstance(session, (Session, CurlSession)):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not {session!r}")
res = session.get(url, **args)
# Handle requests and curl_cffi response objects (identical interface here)
if isinstance(res, (requests.Response, CurlResponse)):
if not res.ok:
raise requests.ConnectionError("Failed to request the M3U(8) document.", response=res)
content = res.text
else:
raise TypeError(f"Expected response to be a requests.Response or curl_cffi.Response, not {type(res)}")
master = m3u8.loads(content, uri=url)
return cls(master, session)
@classmethod
def from_text(cls, text: str, url: str) -> HLS:
if not text:
raise ValueError("HLS manifest Text must be provided.")
if not isinstance(text, str):
raise TypeError(f"Expected text to be a {str}, not {text!r}")
if not url:
raise requests.URLRequired("HLS manifest URL must be provided for relative path computations.")
if not isinstance(url, str):
raise TypeError(f"Expected url to be a {str}, not {url!r}")
master = m3u8.loads(text, uri=url)
return cls(master)
def to_tracks(self, language: Union[str, Language]) -> Tracks:
"""
Convert a Variant Playlist M3U(8) document to Video, Audio and Subtitle Track objects.
Parameters:
language: Language you expect the Primary Track to be in.
All Track objects' URL will be to another M3U(8) document. However, these documents
will be Invariant Playlists and contain the list of segment URIs among other metadata.
"""
session_keys = list(self.manifest.session_keys or [])
if not session_keys:
session_keys = HLS.parse_session_data_keys(self.manifest, self.session)
session_drm = HLS.get_all_drm(session_keys)
audio_codecs_by_group_id: dict[str, Audio.Codec] = {}
tracks = Tracks()
for playlist in self.manifest.playlists:
audio_group = playlist.stream_info.audio
if audio_group:
audio_codec = Audio.Codec.from_codecs(playlist.stream_info.codecs)
audio_codecs_by_group_id[audio_group] = audio_codec
try:
# TODO: Any better way to figure out the primary track type?
if playlist.stream_info.codecs:
Video.Codec.from_codecs(playlist.stream_info.codecs)
except ValueError:
primary_track_type = Audio
else:
primary_track_type = Video
tracks.add(
primary_track_type(
id_=hex(crc32(str(playlist).encode()))[2:],
url=urljoin(playlist.base_uri, playlist.uri),
codec=(
primary_track_type.Codec.from_codecs(playlist.stream_info.codecs)
if playlist.stream_info.codecs
else None
),
language=language, # HLS manifests do not seem to have language info
is_original_lang=True, # TODO: All we can do is assume Yes
bitrate=playlist.stream_info.average_bandwidth or playlist.stream_info.bandwidth,
descriptor=Video.Descriptor.HLS,
drm=session_drm,
data={"hls": {"playlist": playlist}},
# video track args
**(
dict(
range_=Video.Range.DV
if any(
codec.split(".")[0] in ("dva1", "dvav", "dvhe", "dvh1")
for codec in (playlist.stream_info.codecs or "").lower().split(",")
)
else Video.Range.from_m3u_range_tag(playlist.stream_info.video_range),
width=playlist.stream_info.resolution[0] if playlist.stream_info.resolution else None,
height=playlist.stream_info.resolution[1] if playlist.stream_info.resolution else None,
fps=playlist.stream_info.frame_rate,
)
if primary_track_type is Video
else {}
),
)
)
for media in self.manifest.media:
if not media.uri:
continue
joc = 0
if media.type == "AUDIO":
track_type = Audio
codec = audio_codecs_by_group_id.get(media.group_id)
if media.channels and media.channels.endswith("/JOC"):
joc = int(media.channels.split("/JOC")[0])
media.channels = "5.1"
else:
track_type = Subtitle
codec = Subtitle.Codec.WebVTT # assuming WebVTT, codec info isn't shown
track_lang = next(
(
Language.get(option)
for x in (media.language, language)
for option in [(str(x) or "").strip()]
if tag_is_valid(option) and not option.startswith("und")
),
None,
)
if not track_lang:
msg = "Language information could not be derived for a media."
if language is None:
msg += " No fallback language was provided when calling HLS.to_tracks()."
elif not tag_is_valid((str(language) or "").strip()) or str(language).startswith("und"):
msg += f" The fallback language provided is also invalid: {language}"
raise ValueError(msg)
tracks.add(
track_type(
id_=hex(crc32(str(media).encode()))[2:],
url=urljoin(media.base_uri, media.uri),
codec=codec,
language=track_lang, # HLS media may not have language info, fallback if needed
is_original_lang=bool(language and is_close_match(track_lang, [language])),
descriptor=Audio.Descriptor.HLS,
drm=session_drm if media.type == "AUDIO" else None,
data={"hls": {"media": media}},
# audio track args
**(
dict(
bitrate=0, # TODO: M3U doesn't seem to state bitrate?
channels=media.channels,
joc=joc,
descriptive="public.accessibility.describes-video" in (media.characteristics or ""),
)
if track_type is Audio
else dict(
forced=media.forced == "YES",
sdh="public.accessibility.describes-music-and-sound" in (media.characteristics or ""),
)
if track_type is Subtitle
else {}
),
)
)
return tracks
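# Usage sketch (illustrative only; the URL is a hypothetical placeholder):
#
#     hls = HLS.from_url("https://cdn.example.com/master.m3u8")
#     tracks = hls.to_tracks(language="en")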
@staticmethod
def download_track(
track: AnyTrack,
save_path: Path,
save_dir: Path,
progress: partial,
session: Optional[Union[Session, CurlSession]] = None,
proxy: Optional[str] = None,
max_workers: Optional[int] = None,
license_widevine: Optional[Callable] = None,
*,
cdm: Optional[object] = None,
) -> None:
if not session:
session = Session()
elif not isinstance(session, (Session, CurlSession)):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not {session!r}")
if proxy:
# Handle proxies differently based on session type
if isinstance(session, Session):
session.proxies.update({"all": proxy})
log = logging.getLogger("HLS")
if track.from_file:
master = m3u8.load(str(track.from_file))
else:
# Get the playlist text and handle both session types
response = session.get(track.url)
if isinstance(response, (requests.Response, CurlResponse)):
if not response.ok:
log.error(f"Failed to request the invariant M3U8 playlist: {response.status_code}")
sys.exit(1)
playlist_text = response.text
else:
raise TypeError(f"Expected response to be a requests.Response or curl_cffi.Response, not {type(response)}")
master = m3u8.loads(playlist_text, uri=track.url)
if not master.segments:
log.error("Track's HLS playlist has no segments, expecting an invariant M3U8 playlist.")
sys.exit(1)
if track.drm:
session_drm = track.get_drm_for_cdm(cdm)
if isinstance(session_drm, (Widevine, PlayReady)):
# license and grab content keys
try:
if not license_widevine:
raise ValueError("license_widevine func must be supplied to use DRM")
progress(downloaded="LICENSING")
license_widevine(session_drm)
progress(downloaded="[yellow]LICENSED")
except Exception: # noqa
DOWNLOAD_CANCELLED.set() # skip pending track downloads
progress(downloaded="[red]FAILED")
raise
else:
session_drm = None
if DOWNLOAD_LICENCE_ONLY.is_set():
progress(downloaded="[yellow]SKIPPED")
return
unwanted_segments = [
segment for segment in master.segments if callable(track.OnSegmentFilter) and track.OnSegmentFilter(segment)
]
total_segments = len(master.segments) - len(unwanted_segments)
progress(total=total_segments)
downloader = track.downloader
if downloader.__name__ == "aria2c" and any(x.byterange for x in master.segments if x not in unwanted_segments):
downloader = requests_downloader
log.warning("Falling back to the requests downloader as aria2(c) doesn't support the Range header")
urls: list[dict[str, Any]] = []
segment_durations: list[int] = []
range_offset = 0
for segment in master.segments:
if segment in unwanted_segments:
continue
segment_durations.append(int(segment.duration))
if segment.byterange:
byte_range = HLS.calculate_byte_range(segment.byterange, range_offset)
range_offset = int(byte_range.split("-")[0])
else:
byte_range = None
urls.append(
{
"url": urljoin(segment.base_uri, segment.uri),
"headers": {"Range": f"bytes={byte_range}"} if byte_range else {},
}
)
track.data["hls"]["segment_durations"] = segment_durations
segment_save_dir = save_dir / "segments"
skip_merge = False
downloader_args = dict(
urls=urls,
output_dir=segment_save_dir,
filename="{i:0%d}{ext}" % len(str(len(urls))),
headers=session.headers,
cookies=session.cookies,
proxy=proxy,
max_workers=max_workers,
)
if downloader.__name__ == "n_m3u8dl_re":
skip_merge = True
downloader_args.update(
{
"output_dir": save_dir,
"filename": track.id,
"track": track,
"content_keys": session_drm.content_keys if session_drm else None,
}
)
for status_update in downloader(**downloader_args):
file_downloaded = status_update.get("file_downloaded")
if file_downloaded:
events.emit(events.Types.SEGMENT_DOWNLOADED, track=track, segment=file_downloaded)
else:
downloaded = status_update.get("downloaded")
if downloaded and downloaded.endswith("/s"):
status_update["downloaded"] = f"HLS {downloaded}"
progress(**status_update)
# see https://github.com/devine-dl/devine/issues/71
for control_file in segment_save_dir.glob("*.aria2__temp"):
control_file.unlink()
if not skip_merge:
progress(total=total_segments, completed=0, downloaded="Merging")
name_len = len(str(total_segments))
discon_i = 0
range_offset = 0
map_data: Optional[tuple[m3u8.model.InitializationSection, bytes]] = None
if session_drm:
encryption_data: Optional[tuple[Optional[m3u8.Key], DRM_T]] = (None, session_drm)
else:
encryption_data: Optional[tuple[Optional[m3u8.Key], DRM_T]] = None
i = -1
for real_i, segment in enumerate(master.segments):
if segment not in unwanted_segments:
i += 1
is_last_segment = (real_i + 1) == len(master.segments)
def merge(to: Path, via: list[Path], delete: bool = False, include_map_data: bool = False):
"""
Merge all files to a given path, optionally including map data.
Parameters:
to: The output file with all merged data.
via: List of files to merge, in sequence.
delete: Delete the file once it's been merged.
include_map_data: Whether to include the init map data.
"""
with open(to, "wb") as x:
if include_map_data and map_data and map_data[1]:
x.write(map_data[1])
for file in via:
x.write(file.read_bytes())
x.flush()
if delete:
file.unlink()
def decrypt(include_this_segment: bool) -> Path:
"""
Decrypt all segments that use the currently set DRM.
All segments that will be decrypted with this DRM will be merged together
in sequence, prefixed with the init data (if any), and then deleted. Once
merged they will be decrypted. The merged and decrypted file names state
the range of segments that were used.
Parameters:
include_this_segment: Whether to include the current segment in the
list of segments to merge and decrypt. This should be False if
decrypting on EXT-X-KEY changes, or True when decrypting on the
last segment.
Returns the decrypted path.
"""
drm = encryption_data[1]
first_segment_i = next(
int(file.stem) for file in sorted(segment_save_dir.iterdir()) if file.stem.isdigit()
)
last_segment_i = max(0, i - int(not include_this_segment))
range_len = (last_segment_i - first_segment_i) + 1
segment_range = f"{str(first_segment_i).zfill(name_len)}-{str(last_segment_i).zfill(name_len)}"
merged_path = (
segment_save_dir / f"{segment_range}{get_extension(master.segments[last_segment_i].uri)}"
)
decrypted_path = segment_save_dir / f"{merged_path.stem}_decrypted{merged_path.suffix}"
files = [
file
for file in sorted(segment_save_dir.iterdir())
if file.stem.isdigit() and first_segment_i <= int(file.stem) <= last_segment_i
]
if not files:
raise ValueError(f"None of the segment files for {segment_range} exist...")
elif len(files) != range_len:
raise ValueError(f"Missing {range_len - len(files)} segment files for {segment_range}...")
if isinstance(drm, (Widevine, PlayReady)):
# with widevine we can merge all segments and decrypt once
merge(to=merged_path, via=files, delete=True, include_map_data=True)
drm.decrypt(merged_path)
merged_path.rename(decrypted_path)
else:
# with other drm we must decrypt separately and then merge them
# for aes this is because each segment likely has 16-byte padding
for file in files:
drm.decrypt(file)
merge(to=merged_path, via=files, delete=True, include_map_data=True)
events.emit(events.Types.TRACK_DECRYPTED, track=track, drm=drm, segment=decrypted_path)
return decrypted_path
def merge_discontinuity(include_this_segment: bool, include_map_data: bool = True):
"""
Merge all segments of the discontinuity.
All segment files for this discontinuity must already be downloaded and
already decrypted (if it needs to be decrypted).
Parameters:
include_this_segment: Whether to include the current segment in the
list of segments to merge and decrypt. This should be False if
decrypting on EXT-X-KEY changes, or True when decrypting on the
last segment.
include_map_data: Whether to prepend the init map data before the
segment files when merging.
"""
last_segment_i = max(0, i - int(not include_this_segment))
files = [
file
for file in sorted(segment_save_dir.iterdir())
if int(file.stem.replace("_decrypted", "").split("-")[-1]) <= last_segment_i
]
if files:
to_dir = segment_save_dir.parent
to_path = to_dir / f"{str(discon_i).zfill(name_len)}{files[-1].suffix}"
merge(to=to_path, via=files, delete=True, include_map_data=include_map_data)
if segment not in unwanted_segments:
if isinstance(track, Subtitle):
segment_file_ext = get_extension(segment.uri)
segment_file_path = segment_save_dir / f"{str(i).zfill(name_len)}{segment_file_ext}"
segment_data = try_ensure_utf8(segment_file_path.read_bytes())
if track.codec not in (Subtitle.Codec.fVTT, Subtitle.Codec.fTTML):
segment_data = (
segment_data.decode("utf8")
.replace("&lrm;", html.unescape("&lrm;"))
.replace("&rlm;", html.unescape("&rlm;"))
.encode("utf8")
)
segment_file_path.write_bytes(segment_data)
if segment.discontinuity and i != 0:
if encryption_data:
decrypt(include_this_segment=False)
merge_discontinuity(
include_this_segment=False, include_map_data=not encryption_data or not encryption_data[1]
)
discon_i += 1
range_offset = 0 # TODO: Should this be reset or not?
map_data = None
if encryption_data:
encryption_data = (encryption_data[0], encryption_data[1])
if segment.init_section and (not map_data or segment.init_section != map_data[0]):
if segment.init_section.byterange:
init_byte_range = HLS.calculate_byte_range(segment.init_section.byterange, range_offset)
range_offset = int(init_byte_range.split("-")[0])  # keep as int so calculate_byte_range's arithmetic can't raise a TypeError
init_range_header = {"Range": f"bytes={init_byte_range}"}
else:
init_range_header = {}
# Handle both session types for init section request
res = session.get(
url=urljoin(segment.init_section.base_uri, segment.init_section.uri),
headers=init_range_header,
)
# Accept both session backends: requests and curl_cffi responses both
# expose raise_for_status() and .content, so duck-type here rather than
# rejecting valid curl_cffi responses
if hasattr(res, "raise_for_status"):
res.raise_for_status()
init_content = res.content
else:
raise TypeError(
f"Expected response to be requests.Response or curl_cffi.Response, not {type(res)}"
)
map_data = (segment.init_section, init_content)
segment_keys = getattr(segment, "keys", None)
if segment_keys:
key = HLS.get_supported_key(segment_keys)
if encryption_data and encryption_data[0] != key and i != 0 and segment not in unwanted_segments:
decrypt(include_this_segment=False)
if key is None:
encryption_data = None
elif not encryption_data or encryption_data[0] != key:
drm = HLS.get_drm(key, session)
if isinstance(drm, (Widevine, PlayReady)):
try:
if map_data:
track_kid = track.get_key_id(map_data[1])
else:
track_kid = None
progress(downloaded="LICENSING")
license_widevine(drm, track_kid=track_kid)
progress(downloaded="[yellow]LICENSED")
except Exception: # noqa
DOWNLOAD_CANCELLED.set() # skip pending track downloads
progress(downloaded="[red]FAILED")
raise
encryption_data = (key, drm)
if DOWNLOAD_LICENCE_ONLY.is_set():
continue
if is_last_segment:
# required as it won't end with EXT-X-DISCONTINUITY nor a new key
if encryption_data:
decrypt(include_this_segment=True)
merge_discontinuity(
include_this_segment=True, include_map_data=not encryption_data or not encryption_data[1]
)
progress(advance=1)
if DOWNLOAD_LICENCE_ONLY.is_set():
return
def find_segments_recursively(directory: Path) -> list[Path]:
"""Find all segment files recursively in any directory structure created by downloaders."""
segments = []
# First check direct files in the directory
if directory.exists():
segments.extend([x for x in directory.iterdir() if x.is_file()])
# If no direct files, recursively search subdirectories
if not segments:
for subdir in directory.iterdir():
if subdir.is_dir():
segments.extend(find_segments_recursively(subdir))
return sorted(segments)
# finally merge all the discontinuity save files together to the final path
segments_to_merge = find_segments_recursively(save_dir)
if len(segments_to_merge) == 1:
shutil.move(segments_to_merge[0], save_path)
else:
progress(downloaded="Merging")
if isinstance(track, (Video, Audio)):
HLS.merge_segments(segments=segments_to_merge, save_path=save_path)
else:
with open(save_path, "wb") as f:
for discontinuity_file in segments_to_merge:
discontinuity_data = discontinuity_file.read_bytes()
f.write(discontinuity_data)
f.flush()
os.fsync(f.fileno())
discontinuity_file.unlink()
# Clean up empty segment directory
if save_dir.exists() and save_dir.name.endswith("_segments"):
try:
save_dir.rmdir()
except OSError:
# Directory might not be empty, try removing recursively
shutil.rmtree(save_dir, ignore_errors=True)
progress(downloaded="Downloaded")
track.path = save_path
events.emit(events.Types.TRACK_DOWNLOADED, track=track)
@staticmethod
def merge_segments(segments: list[Path], save_path: Path) -> int:
"""
Concatenate Segments using FFmpeg concat with binary fallback.
Returns the file size of the merged file.
"""
# Track segment directories for cleanup
segment_dirs = set()
for segment in segments:
# Track all parent directories that contain segments
current_dir = segment.parent
while current_dir.name and "_segments" in str(current_dir):
segment_dirs.add(current_dir)
current_dir = current_dir.parent
def cleanup_segments_and_dirs():
"""Clean up segments and directories after successful merge."""
for segment in segments:
segment.unlink(missing_ok=True)
for segment_dir in segment_dirs:
if segment_dir.exists():
try:
shutil.rmtree(segment_dir)
except OSError:
pass # Directory cleanup failed, but merge succeeded
# Try FFmpeg concat first (preferred method)
if binaries.FFMPEG:
try:
demuxer_file = save_path.parent / f"ffmpeg_concat_demuxer_{save_path.stem}.txt"
demuxer_file.write_text("\n".join([f"file '{segment.absolute()}'" for segment in segments]))
subprocess.check_call(
[
binaries.FFMPEG,
"-hide_banner",
"-loglevel",
"error",
"-f",
"concat",
"-safe",
"0",
"-i",
demuxer_file,
"-map",
"0",
"-c",
"copy",
save_path,
],
timeout=300, # 5 minute timeout
)
demuxer_file.unlink(missing_ok=True)
cleanup_segments_and_dirs()
return save_path.stat().st_size
except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError) as e:
# FFmpeg failed, clean up demuxer file and fall back to binary concat
logging.getLogger("HLS").debug(f"FFmpeg concat failed ({e}), falling back to binary concatenation")
demuxer_file.unlink(missing_ok=True)
# Remove partial output file if it exists
save_path.unlink(missing_ok=True)
# Fallback: Binary concatenation
logging.getLogger("HLS").debug(f"Using binary concatenation for {len(segments)} segments")
with open(save_path, "wb") as output_file:
for segment in segments:
with open(segment, "rb") as segment_file:
output_file.write(segment_file.read())
cleanup_segments_and_dirs()
return save_path.stat().st_size
@staticmethod
def parse_session_data_keys(
manifest: M3U8, session: Optional[Union[Session, CurlSession]] = None
) -> list[m3u8.model.Key]:
"""Parse `com.apple.hls.keys` session data and return Key objects."""
keys: list[m3u8.model.Key] = []
for data in getattr(manifest, "session_data", []) or []:
if getattr(data, "data_id", None) != "com.apple.hls.keys":
continue
value = getattr(data, "value", None)
if not value and data.uri:
if not session:
session = Session()
res = session.get(urljoin(manifest.base_uri or "", data.uri))
value = res.text
if not value:
continue
try:
decoded = base64.b64decode(value).decode()
except Exception:
decoded = value
try:
items = json.loads(decoded)
except Exception:
continue
for item in items if isinstance(items, list) else []:
if not isinstance(item, dict):
continue
key = m3u8.model.Key(
method=item.get("method"),
base_uri=manifest.base_uri or "",
uri=item.get("uri"),
keyformat=item.get("keyformat"),
keyformatversions=",".join(item.get("keyformatversion") or item.get("keyformatversions") or []),
)
if key.method in {"AES-128", "ISO-23001-7"} or (
key.keyformat
and key.keyformat.lower()
in {
WidevineCdm.urn,
PlayReadyCdm,
"com.microsoft.playready",
}
):
keys.append(key)
return keys
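# Illustrative only: the decoded `com.apple.hls.keys` payload this method
# parses is a base64-encoded JSON list roughly shaped like the below
# (values are hypothetical, not from any real manifest):
# [
#   {
#     "method": "SAMPLE-AES",
#     "uri": "data:text/plain;base64,<pssh>",
#     "keyformat": "urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed",
#     "keyformatversions": ["1"]
#   }
# ]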
@staticmethod
def get_supported_key(keys: list[Union[m3u8.model.SessionKey, m3u8.model.Key]]) -> Optional[m3u8.Key]:
"""
Get a supported Key System from a list of Key Systems.
Note that the key systems are chosen in an opinionated order.
Returns None if one of the key systems is method=NONE, which means all segments
from then on should be treated as plain text until another key system is
encountered, unless it's also method=NONE.
Raises NotImplementedError if none of the key systems are supported.
"""
if any(key.method == "NONE" for key in keys):
return None
unsupported_systems = []
for key in keys:
if not key:
continue
# TODO: Add a way to specify which supported key system to use
# TODO: Add support for 'SAMPLE-AES', 'AES-CTR', 'AES-CBC', 'ClearKey'
elif key.method == "AES-128":
return key
elif key.method == "ISO-23001-7":
return key
elif key.keyformat and key.keyformat.lower() == WidevineCdm.urn:
return key
elif key.keyformat and (
key.keyformat.lower() == PlayReadyCdm or "com.microsoft.playready" in key.keyformat.lower()
):
return key
else:
unsupported_systems.append(key.method + (f" ({key.keyformat})" if key.keyformat else ""))
else:
raise NotImplementedError(f"None of the key systems are supported: {', '.join(unsupported_systems)}")
@staticmethod
def get_drm(
key: Union[m3u8.model.SessionKey, m3u8.model.Key],
session: Optional[Union[Session, CurlSession]] = None,
) -> DRM_T:
"""
Convert HLS EXT-X-KEY data to an initialized DRM object.
Parameters:
key: m3u8 key system (EXT-X-KEY) object.
session: Optional session used to request AES-128 URIs.
Useful to set headers, proxies, cookies, and so forth.
Raises a NotImplementedError if the key system is not supported.
"""
if not isinstance(session, (Session, CurlSession, type(None))):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not {type(session)}")
if not session:
session = Session()
# TODO: Add support for 'SAMPLE-AES', 'AES-CTR', 'AES-CBC', 'ClearKey'
if key.method == "AES-128":
drm = ClearKey.from_m3u_key(key, session)
elif key.method == "ISO-23001-7":
drm = Widevine(pssh=WV_PSSH.new(key_ids=[key.uri.split(",")[-1]], system_id=WV_PSSH.SystemId.Widevine))
elif key.keyformat and key.keyformat.lower() == WidevineCdm.urn:
drm = Widevine(
pssh=WV_PSSH(key.uri.split(",")[-1]),
**key._extra_params, # noqa
)
elif key.keyformat and (
key.keyformat.lower() == PlayReadyCdm or "com.microsoft.playready" in key.keyformat.lower()
):
drm = PlayReady(
pssh=PR_PSSH(key.uri.split(",")[-1]),
pssh_b64=key.uri.split(",")[-1],
)
else:
raise NotImplementedError(f"The key system is not supported: {key}")
return drm
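# For example, an EXT-X-KEY such as the below (hypothetical URI) would be
# routed to the Widevine branch above, while a METHOD=AES-128 key is
# converted to a ClearKey object via the provided session:
#   #EXT-X-KEY:METHOD=SAMPLE-AES,URI="data:text/plain;base64,<pssh>",
#     KEYFORMAT="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"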
@staticmethod
def get_all_drm(
keys: list[Union[m3u8.model.SessionKey, m3u8.model.Key]], proxy: Optional[str] = None
) -> list[DRM_T]:
"""
Convert HLS EXT-X-KEY data to initialized DRM objects.
Parameters:
keys: m3u8 key system (EXT-X-KEY) objects.
proxy: Optional proxy string used for requesting AES-128 URIs.
Raises a NotImplementedError if none of the key systems are supported.
"""
unsupported_keys: list[m3u8.Key] = []
drm_objects: list[DRM_T] = []
if any(key.method == "NONE" for key in keys):
return []
# get_drm expects a session, not a raw proxy string, so build one that
# routes through the provided proxy (if any)
session = Session()
if proxy:
session.proxies.update({"all": proxy})
for key in keys:
try:
drm = HLS.get_drm(key, session)
drm_objects.append(drm)
except NotImplementedError:
unsupported_keys.append(key)
if not drm_objects and unsupported_keys:
logging.debug(
"Ignoring unsupported key systems: %s",
", ".join([str(k.keyformat or k.method) for k in unsupported_keys]),
)
return []
return drm_objects
@staticmethod
def calculate_byte_range(m3u_range: str, fallback_offset: int = 0) -> str:
"""
Convert a HLS EXT-X-BYTERANGE value to a more traditional range value.
E.g., '1433@0' -> '0-1432', '357392@1433' -> '1433-358824'.
"""
parts = [int(x) for x in m3u_range.split("@")]
if len(parts) != 2:
parts.append(fallback_offset)
length, offset = parts
return f"{offset}-{offset + length - 1}"
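# Worked examples (the first two from the docstring, the third a
# hypothetical offset-less range that falls back to a prior offset):
#   HLS.calculate_byte_range("1433@0")        -> "0-1432"
#   HLS.calculate_byte_range("357392@1433")   -> "1433-358824"
#   HLS.calculate_byte_range("1024", 358825)  -> "358825-359848"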
__all__ = ("HLS",)

View File

@ -0,0 +1,338 @@
from __future__ import annotations
import base64
import hashlib
import html
import shutil
import urllib.parse
from functools import partial
from pathlib import Path
from typing import Any, Callable, Optional, Union
import requests
from curl_cffi.requests import Session as CurlSession
from langcodes import Language, tag_is_valid
from lxml.etree import Element
from pyplayready.system.pssh import PSSH as PR_PSSH
from pywidevine.pssh import PSSH
from requests import Session
from unshackle.core.constants import DOWNLOAD_CANCELLED, DOWNLOAD_LICENCE_ONLY, AnyTrack
from unshackle.core.drm import DRM_T, PlayReady, Widevine
from unshackle.core.events import events
from unshackle.core.tracks import Audio, Subtitle, Track, Tracks, Video
from unshackle.core.utilities import try_ensure_utf8
from unshackle.core.utils.xml import load_xml
class ISM:
def __init__(self, manifest: Element, url: str) -> None:
if manifest.tag != "SmoothStreamingMedia":
raise TypeError(f"Expected 'SmoothStreamingMedia' document, got '{manifest.tag}'")
if not url:
raise requests.URLRequired("ISM manifest URL must be provided for relative paths")
self.manifest = manifest
self.url = url
@classmethod
def from_url(cls, url: str, session: Optional[Union[Session, CurlSession]] = None, **kwargs: Any) -> "ISM":
if not url:
raise requests.URLRequired("ISM manifest URL must be provided")
if not session:
session = Session()
elif not isinstance(session, (Session, CurlSession)):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not {session!r}")
res = session.get(url, **kwargs)
if res.url != url:
url = res.url
res.raise_for_status()
return cls(load_xml(res.content), url)
@classmethod
def from_text(cls, text: str, url: str) -> "ISM":
if not text:
raise ValueError("ISM manifest text must be provided")
if not url:
raise requests.URLRequired("ISM manifest URL must be provided for relative paths")
return cls(load_xml(text), url)
@staticmethod
def _get_drm(headers: list[Element]) -> list[DRM_T]:
drm: list[DRM_T] = []
for header in headers:
system_id = (header.get("SystemID") or header.get("SystemId") or "").lower()
data = "".join(header.itertext()).strip()
if not data:
continue
if system_id == "edef8ba9-79d6-4ace-a3c8-27dcd51d21ed":
try:
pssh = PSSH(base64.b64decode(data))
except Exception:
continue
kid = next(iter(pssh.key_ids), None)
drm.append(Widevine(pssh=pssh, kid=kid))
elif system_id == "9a04f079-9840-4286-ab92-e65be0885f95":
try:
pr_pssh = PR_PSSH(data)
except Exception:
continue
drm.append(PlayReady(pssh=pr_pssh, pssh_b64=data))
return drm
def to_tracks(self, language: Optional[Union[str, Language]] = None) -> Tracks:
tracks = Tracks()
base_url = self.url
duration = int(self.manifest.get("Duration") or 0)
drm = self._get_drm(self.manifest.xpath(".//ProtectionHeader"))
for stream_index in self.manifest.findall("StreamIndex"):
content_type = stream_index.get("Type")
if not content_type:
raise ValueError("No content type value could be found")
for ql in stream_index.findall("QualityLevel"):
codec = ql.get("FourCC")
if codec == "TTML":
codec = "STPP"
track_lang = None
lang = (stream_index.get("Language") or "").strip()
if lang and tag_is_valid(lang) and not lang.startswith("und"):
track_lang = Language.get(lang)
track_urls: list[str] = []
fragment_time = 0
fragments = stream_index.findall("c")
# Some manifests omit the first fragment in the <c> list but
# still expect a request for start time 0 which contains the
# initialization segment. If the first declared fragment is not
# at time 0, prepend the missing initialization URL.
if fragments:
first_time = int(fragments[0].get("t") or 0)
if first_time != 0:
track_urls.append(
urllib.parse.urljoin(
base_url,
stream_index.get("Url").format_map(
{
"bitrate": ql.get("Bitrate"),
"start time": "0",
}
),
)
)
for idx, frag in enumerate(fragments):
fragment_time = int(frag.get("t", fragment_time))
repeat = int(frag.get("r", 1))
duration_frag = int(frag.get("d") or 0)
if not duration_frag:
try:
next_time = int(fragments[idx + 1].get("t"))
except (IndexError, AttributeError):
next_time = duration
duration_frag = (next_time - fragment_time) / repeat
for _ in range(repeat):
track_urls.append(
urllib.parse.urljoin(
base_url,
stream_index.get("Url").format_map(
{
"bitrate": ql.get("Bitrate"),
"start time": str(fragment_time),
}
),
)
)
fragment_time += duration_frag
track_id = hashlib.md5(
f"{codec}-{track_lang}-{ql.get('Bitrate') or 0}-{ql.get('Index') or 0}".encode()
).hexdigest()
data = {
"ism": {
"manifest": self.manifest,
"stream_index": stream_index,
"quality_level": ql,
"segments": track_urls,
}
}
if content_type == "video":
try:
vcodec = Video.Codec.from_mime(codec) if codec else None
except ValueError:
vcodec = None
tracks.add(
Video(
id_=track_id,
url=self.url,
codec=vcodec,
language=track_lang or language,
is_original_lang=bool(language and track_lang and str(track_lang) == str(language)),
bitrate=ql.get("Bitrate"),
width=int(ql.get("MaxWidth") or 0) or int(stream_index.get("MaxWidth") or 0),
height=int(ql.get("MaxHeight") or 0) or int(stream_index.get("MaxHeight") or 0),
descriptor=Video.Descriptor.ISM,
drm=drm,
data=data,
)
)
elif content_type == "audio":
try:
acodec = Audio.Codec.from_mime(codec) if codec else None
except ValueError:
acodec = None
tracks.add(
Audio(
id_=track_id,
url=self.url,
codec=acodec,
language=track_lang or language,
is_original_lang=bool(language and track_lang and str(track_lang) == str(language)),
bitrate=ql.get("Bitrate"),
channels=ql.get("Channels"),
descriptor=Track.Descriptor.ISM,
drm=drm,
data=data,
)
)
else:
try:
scodec = Subtitle.Codec.from_mime(codec) if codec else None
except ValueError:
scodec = None
tracks.add(
Subtitle(
id_=track_id,
url=self.url,
codec=scodec,
language=track_lang or language,
is_original_lang=bool(language and track_lang and str(track_lang) == str(language)),
descriptor=Track.Descriptor.ISM,
drm=drm,
data=data,
)
)
return tracks
@staticmethod
def download_track(
track: AnyTrack,
save_path: Path,
save_dir: Path,
progress: partial,
session: Optional[Session] = None,
proxy: Optional[str] = None,
max_workers: Optional[int] = None,
license_widevine: Optional[Callable] = None,
*,
cdm: Optional[object] = None,
) -> None:
if not session:
session = Session()
elif not isinstance(session, Session):
raise TypeError(f"Expected session to be a {Session}, not {session!r}")
if proxy:
session.proxies.update({"all": proxy})
segments: list[str] = track.data["ism"]["segments"]
session_drm = None
if track.drm:
# Mirror HLS.download_track: pick the DRM matching the provided CDM
# (or the first available) and license it if supported.
session_drm = track.get_drm_for_cdm(cdm)
if isinstance(session_drm, (Widevine, PlayReady)):
try:
if not license_widevine:
raise ValueError("license_widevine func must be supplied to use DRM")
progress(downloaded="LICENSING")
license_widevine(session_drm)
progress(downloaded="[yellow]LICENSED")
except Exception:
DOWNLOAD_CANCELLED.set()
progress(downloaded="[red]FAILED")
raise
if DOWNLOAD_LICENCE_ONLY.is_set():
progress(downloaded="[yellow]SKIPPED")
return
progress(total=len(segments))
downloader = track.downloader
skip_merge = False
downloader_args = dict(
urls=[{"url": url} for url in segments],
output_dir=save_dir,
filename="{i:0%d}.mp4" % len(str(len(segments))),
headers=session.headers,
cookies=session.cookies,
proxy=proxy,
max_workers=max_workers,
)
if downloader.__name__ == "n_m3u8dl_re":
skip_merge = True
downloader_args.update(
{
"filename": track.id,
"track": track,
"content_keys": session_drm.content_keys if session_drm else None,
}
)
for status_update in downloader(**downloader_args):
file_downloaded = status_update.get("file_downloaded")
if file_downloaded:
events.emit(events.Types.SEGMENT_DOWNLOADED, track=track, segment=file_downloaded)
else:
downloaded = status_update.get("downloaded")
if downloaded and downloaded.endswith("/s"):
status_update["downloaded"] = f"ISM {downloaded}"
progress(**status_update)
for control_file in save_dir.glob("*.aria2__temp"):
control_file.unlink()
segments_to_merge = [x for x in sorted(save_dir.iterdir()) if x.is_file()]
if skip_merge:
shutil.move(segments_to_merge[0], save_path)
else:
with open(save_path, "wb") as f:
for segment_file in segments_to_merge:
segment_data = segment_file.read_bytes()
if (
not session_drm
and isinstance(track, Subtitle)
and track.codec not in (Subtitle.Codec.fVTT, Subtitle.Codec.fTTML)
):
segment_data = try_ensure_utf8(segment_data)
segment_data = (
segment_data.decode("utf8")
.replace("&lrm;", html.unescape("&lrm;"))
.replace("&rlm;", html.unescape("&rlm;"))
.encode("utf8")
)
f.write(segment_data)
f.flush()
segment_file.unlink()
progress(advance=1)
track.path = save_path
events.emit(events.Types.TRACK_DOWNLOADED, track=track)
if not skip_merge and session_drm:
progress(downloaded="Decrypting", completed=0, total=100)
session_drm.decrypt(save_path)
track.drm = None
events.emit(events.Types.TRACK_DECRYPTED, track=track, drm=session_drm, segment=None)
progress(downloaded="Decrypting", advance=100)
save_dir.rmdir()
progress(downloaded="Downloaded")
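# Usage sketch (hypothetical manifest URL; assumes a requests session
# already prepared with any required cookies/headers):
#   ism = ISM.from_url("https://example.com/video.ism/Manifest", session=session)
#   tracks = ism.to_tracks(language="en")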
__all__ = ("ISM",)

View File

@ -0,0 +1,34 @@
"""Utility functions for parsing M3U8 playlists."""
from __future__ import annotations
from typing import Optional, Union
import m3u8
from curl_cffi.requests import Session as CurlSession
from requests import Session
from unshackle.core.manifests.hls import HLS
from unshackle.core.tracks import Tracks
def parse(
master: m3u8.M3U8,
language: str,
*,
session: Optional[Union[Session, CurlSession]] = None,
) -> Tracks:
"""Parse a variant playlist to ``Tracks`` with basic information, defer DRM loading."""
tracks = HLS(master, session=session).to_tracks(language)
has_session_keys = bool(master.session_keys or HLS.parse_session_data_keys(master, session or Session()))
if has_session_keys:
for t in tracks.videos + tracks.audio:
t.needs_drm_loading = True
t.session = session
return tracks
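# Usage sketch (assumes the variant playlist text has already been fetched):
#   master = m3u8.loads(playlist_text, uri=playlist_url)
#   tracks = parse(master, "en", session=session)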
__all__ = ["parse"]

View File

@ -0,0 +1,7 @@
from .basic import Basic
from .hola import Hola
from .nordvpn import NordVPN
from .surfsharkvpn import SurfsharkVPN
from .windscribevpn import WindscribeVPN
__all__ = ("Basic", "Hola", "NordVPN", "SurfsharkVPN", "WindscribeVPN")

View File

@ -0,0 +1,54 @@
import random
import re
from typing import Optional, Union
from requests.utils import prepend_scheme_if_needed
from urllib3.util import parse_url
from unshackle.core.proxies.proxy import Proxy
class Basic(Proxy):
def __init__(self, **countries: Union[str, list[str]]):
"""Basic Proxy Service using Proxies specified in the config."""
self.countries = {k.lower(): v for k, v in countries.items()}
def __repr__(self) -> str:
countries = len(self.countries)
servers = sum(len(v) if isinstance(v, list) else 1 for v in self.countries.values())
return f"{countries} Countr{['ies', 'y'][countries == 1]} ({servers} Server{['s', ''][servers == 1]})"
def get_proxy(self, query: str) -> Optional[str]:
"""Get a proxy URI from the config."""
query = query.lower()
match = re.match(r"^([a-z]{2})(\d+)?$", query, re.IGNORECASE)
if not match:
raise ValueError(f'The query "{query}" was not recognized...')
country_code = match.group(1)
entry = match.group(2)
servers: Optional[Union[str, list[str]]] = self.countries.get(country_code)
if not servers:
return None
if isinstance(servers, str):
proxy = servers
elif entry:
try:
proxy = servers[int(entry) - 1]
except IndexError:
raise ValueError(
f'There\'s only {len(servers)} prox{"y" if len(servers) == 1 else "ies"} for "{country_code}"...'
)
else:
proxy = random.choice(servers)
proxy = prepend_scheme_if_needed(proxy, "http")
parsed_proxy = parse_url(proxy)
if not parsed_proxy.host:
raise ValueError(f"The proxy '{proxy}' is not a valid proxy URI supported by Python-Requests.")
return proxy
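# A minimal config sketch for this provider (hypothetical proxy hosts):
# a string maps a country to one proxy, while a list allows queries like
# "us2" to pick a specific entry:
#   Basic(
#       us=["http://one.example:8080", "http://two.example:8080"],
#       de="http://de.example:8080",
#   )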

View File

@ -0,0 +1,60 @@
import random
import re
import subprocess
from typing import Optional
from unshackle.core import binaries
from unshackle.core.proxies.proxy import Proxy
class Hola(Proxy):
def __init__(self):
"""
Proxy Service using Hola's direct connections via the hola-proxy project.
https://github.com/Snawoot/hola-proxy
"""
self.binary = binaries.HolaProxy
if not self.binary:
raise EnvironmentError("hola-proxy executable not found but is required for the Hola proxy provider.")
self.countries = self.get_countries()
def __repr__(self) -> str:
countries = len(self.countries)
return f"{countries} Countr{['ies', 'y'][countries == 1]}"
def get_proxy(self, query: str) -> Optional[str]:
"""
Get an HTTP proxy URI for a Datacenter ('direct') or Residential ('lum') Hola server.
TODO: - Add ability to select 'lum' proxies (residential proxies).
- Return and use Proxy Authorization
"""
query = query.lower()
p = subprocess.check_output(
[self.binary, "-country", query, "-list-proxies"], stderr=subprocess.STDOUT
).decode()
if "Transaction error: temporary ban detected." in p:
raise ConnectionError("Hola banned your IP temporarily from it's services. Try change your IP.")
username, password, proxy_authorization = re.search(
r"Login: (.*)\nPassword: (.*)\nProxy-Authorization: (.*)", p
).groups()
servers = re.findall(r"(zagent.*)", p)
proxies = []
for server in servers:
host, ip_address, direct, peer, hola, trial, trial_peer, vendor = server.split(",")
proxies.append(f"http://{username}:{password}@{ip_address}:{peer}")
proxy = random.choice(proxies)
return proxy
def get_countries(self) -> list[dict[str, str]]:
"""Get a list of available Countries."""
p = subprocess.check_output([self.binary, "-list-countries"]).decode("utf8")
return [{code: name} for country in p.splitlines() for (code, name) in [country.split(" - ", maxsplit=1)]]

View File

@ -0,0 +1,128 @@
import json
import re
from typing import Optional
import requests
from unshackle.core.proxies.proxy import Proxy
class NordVPN(Proxy):
def __init__(self, username: str, password: str, server_map: Optional[dict[str, int]] = None):
"""
Proxy Service using NordVPN Service Credentials.
A username and password must be provided. These are Service Credentials, not your Login Credentials.
The Service Credentials can be found here: https://my.nordaccount.com/dashboard/nordvpn/
"""
if not username:
raise ValueError("No Username was provided to the NordVPN Proxy Service.")
if not password:
raise ValueError("No Password was provided to the NordVPN Proxy Service.")
if not re.match(r"^[a-z0-9]{48}$", username + password, re.IGNORECASE) or "@" in username:
raise ValueError(
"The Username and Password must be NordVPN Service Credentials, not your Login Credentials. "
"The Service Credentials can be found here: https://my.nordaccount.com/dashboard/nordvpn/"
)
if server_map is not None and not isinstance(server_map, dict):
raise TypeError(f"Expected server_map to be a dict mapping a region to a server ID, not '{server_map!r}'.")
self.username = username
self.password = password
self.server_map = server_map or {}
self.countries = self.get_countries()
def __repr__(self) -> str:
countries = len(self.countries)
servers = sum(x["serverCount"] for x in self.countries)
return f"{countries} Countr{['ies', 'y'][countries == 1]} ({servers} Server{['s', ''][servers == 1]})"
def get_proxy(self, query: str) -> Optional[str]:
"""
Get an HTTP(SSL) proxy URI for a NordVPN server.
HTTP proxies under port 80 were disabled on the 15th of Feb, 2021:
https://nordvpn.com/blog/removing-http-proxies
"""
query = query.lower()
if re.match(r"^[a-z]{2}\d+$", query):
# country and nordvpn server id, e.g., us1, fr1234
hostname = f"{query}.nordvpn.com"
else:
if query.isdigit():
# country id
country = self.get_country(by_id=int(query))
elif re.match(r"^[a-z]+$", query):
# country code
country = self.get_country(by_code=query)
else:
raise ValueError(f"The query provided is unsupported and unrecognized: {query}")
if not country:
# NordVPN doesn't have servers in this region
return
server_mapping = self.server_map.get(country["code"].lower())
if server_mapping:
# country was set to a specific server ID in config
hostname = f"{country['code'].lower()}{server_mapping}.nordvpn.com"
else:
# get the recommended server ID
recommended_servers = self.get_recommended_servers(country["id"])
if not recommended_servers:
raise ValueError(
f"The NordVPN Country {query} currently has no recommended servers. "
"Try again later. If the issue persists, double-check the query."
)
hostname = recommended_servers[0]["hostname"]
if hostname.startswith("gb"):
# NordVPN uses the alpha2 of 'GB' in API responses, but 'UK' in the hostname
hostname = f"gb{hostname[2:]}"
return f"https://{self.username}:{self.password}@{hostname}:89"
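# e.g. a query of "us" resolves a recommended server and yields a URI
# shaped like the following (hypothetical host and credentials):
#   https://<user>:<pass>@us1234.nordvpn.com:89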
def get_country(self, by_id: Optional[int] = None, by_code: Optional[str] = None) -> Optional[dict]:
"""Search for a Country and it's metadata."""
if all(x is None for x in (by_id, by_code)):
raise ValueError("At least one search query must be made.")
for country in self.countries:
if all(
[by_id is None or country["id"] == int(by_id), by_code is None or country["code"] == by_code.upper()]
):
return country
@staticmethod
def get_recommended_servers(country_id: int) -> list[dict]:
"""
Get the list of recommended Servers for a Country.
Note: There may not always be more than one recommended server.
"""
res = requests.get(
url="https://api.nordvpn.com/v1/servers/recommendations", params={"filters[country_id]": country_id}
)
if not res.ok:
raise ValueError(f"Failed to get a list of NordVPN countries [{res.status_code}]")
try:
return res.json()
except json.JSONDecodeError:
raise ValueError("Could not decode list of NordVPN countries, not JSON data.")
@staticmethod
def get_countries() -> list[dict]:
"""Get a list of available Countries and their metadata."""
res = requests.get(
url="https://api.nordvpn.com/v1/servers/countries",
)
if not res.ok:
raise ValueError(f"Failed to get a list of NordVPN countries [{res.status_code}]")
try:
return res.json()
except json.JSONDecodeError:
raise ValueError("Could not decode list of NordVPN countries, not JSON data.")

View File

@ -0,0 +1,31 @@
from abc import ABC, abstractmethod
from typing import Optional
class Proxy(ABC):
@abstractmethod
def __init__(self, **kwargs):
"""
The constructor initializes the Service using passed configuration data.
Any authorization or pre-fetching of data should be done here.
"""
@abstractmethod
def __repr__(self) -> str:
"""Return a string denoting a list of Countries and Servers (if possible)."""
countries = ...
servers = ...
return f"{countries} Countr{['ies', 'y'][countries == 1]} ({servers} Server{['s', ''][servers == 1]})"
@abstractmethod
def get_proxy(self, query: str) -> Optional[str]:
"""
Get a Proxy URI from the Proxy Service.
Only return None if the query was accepted, but no proxy could be returned.
Otherwise, please use exceptions to denote any errors with the call or query.
The returned Proxy URI must be a string supported by Python-Requests:
'{scheme}://[{user}:{pass}@]{host}:{port}'
"""

View File

@ -0,0 +1,124 @@
import json
import random
import re
from typing import Optional
import requests
from unshackle.core.proxies.proxy import Proxy
class SurfsharkVPN(Proxy):
def __init__(self, username: str, password: str, server_map: Optional[dict[str, int]] = None):
"""
Proxy Service using SurfsharkVPN Service Credentials.
A username and password must be provided. These are Service Credentials, not your Login Credentials.
The Service Credentials can be found here: https://my.surfshark.com/vpn/manual-setup/main/openvpn
"""
if not username:
raise ValueError("No Username was provided to the SurfsharkVPN Proxy Service.")
if not password:
raise ValueError("No Password was provided to the SurfsharkVPN Proxy Service.")
if not re.match(r"^[a-z0-9]{48}$", username + password, re.IGNORECASE) or "@" in username:
raise ValueError(
"The Username and Password must be SurfsharkVPN Service Credentials, not your Login Credentials. "
"The Service Credentials can be found here: https://my.surfshark.com/vpn/manual-setup/main/openvpn"
)
if server_map is not None and not isinstance(server_map, dict):
raise TypeError(f"Expected server_map to be a dict mapping a region to a server ID, not '{server_map!r}'.")
self.username = username
self.password = password
self.server_map = server_map or {}
self.countries = self.get_countries()
def __repr__(self) -> str:
countries = len(set(x.get("country") for x in self.countries if x.get("country")))
servers = sum(1 for x in self.countries if x.get("connectionName"))
return f"{countries} Countr{['ies', 'y'][countries == 1]} ({servers} Server{['s', ''][servers == 1]})"
def get_proxy(self, query: str) -> Optional[str]:
"""
Get an HTTP(SSL) proxy URI for a SurfsharkVPN server.
"""
query = query.lower()
if re.match(r"^[a-z]{2}\d+$", query):
# country and surfsharkvpn server id, e.g., au-per, be-anr, us-bos
hostname = f"{query}.prod.surfshark.com"
else:
if query.isdigit():
# country id
country = self.get_country(by_id=int(query))
elif re.match(r"^[a-z]+$", query):
# country code
country = self.get_country(by_code=query)
else:
raise ValueError(f"The query provided is unsupported and unrecognized: {query}")
if not country:
# SurfsharkVPN doesn't have servers in this region
return
server_mapping = self.server_map.get(country["countryCode"].lower())
if server_mapping:
# country was set to a specific server ID in config
hostname = f"{country['code'].lower()}{server_mapping}.prod.surfshark.com"
else:
# get the random server ID
random_server = self.get_random_server(country["countryCode"])
if not random_server:
raise ValueError(
f"The SurfsharkVPN Country {query} currently has no random servers. "
"Try again later. If the issue persists, double-check the query."
)
hostname = random_server
return f"https://{self.username}:{self.password}@{hostname}:443"
def get_country(self, by_id: Optional[int] = None, by_code: Optional[str] = None) -> Optional[dict]:
"""Search for a Country and it's metadata."""
if all(x is None for x in (by_id, by_code)):
raise ValueError("At least one search query must be made.")
for country in self.countries:
if all(
[
by_id is None or country["id"] == int(by_id),
by_code is None or country["countryCode"] == by_code.upper(),
]
):
return country
def get_random_server(self, country_id: str):
"""
Get a random server (connection name) for a Country.
Note: There may not always be more than one server available.
"""
country = [x["connectionName"] for x in self.countries if x["countryCode"].lower() == country_id.lower()]
try:
country = random.choice(country)
return country
except Exception:
raise ValueError("Could not get random countrycode from the countries list.")
@staticmethod
def get_countries() -> list[dict]:
"""Get a list of available Countries and their metadata."""
res = requests.get(
url="https://api.surfshark.com/v3/server/clusters/all",
headers={
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36",
"Content-Type": "application/json",
},
)
if not res.ok:
raise ValueError(f"Failed to get a list of SurfsharkVPN countries [{res.status_code}]")
try:
return res.json()
except json.JSONDecodeError:
raise ValueError("Could not decode list of SurfsharkVPN countries, not JSON data.")

View File

@ -0,0 +1,109 @@
import json
import random
import re
from typing import Optional
import requests
from unshackle.core.proxies.proxy import Proxy
class WindscribeVPN(Proxy):
def __init__(self, username: str, password: str, server_map: Optional[dict[str, str]] = None):
"""
Proxy Service using WindscribeVPN Service Credentials.
A username and password must be provided. These are Service Credentials, not your Login Credentials.
The Service Credentials can be found here: https://windscribe.com/getconfig/openvpn
"""
if not username:
raise ValueError("No Username was provided to the WindscribeVPN Proxy Service.")
if not password:
raise ValueError("No Password was provided to the WindscribeVPN Proxy Service.")
if server_map is not None and not isinstance(server_map, dict):
raise TypeError(f"Expected server_map to be a dict mapping a region to a hostname, not '{server_map!r}'.")
self.username = username
self.password = password
self.server_map = server_map or {}
self.countries = self.get_countries()
def __repr__(self) -> str:
countries = len(set(x.get("country_code") for x in self.countries if x.get("country_code")))
servers = sum(
len(group.get("hosts", []))
for location in self.countries
for group in location.get("groups", [])
)
return f"{countries} Countr{['ies', 'y'][countries == 1]} ({servers} Server{['s', ''][servers == 1]})"
def get_proxy(self, query: str) -> Optional[str]:
"""
Get an HTTPS proxy URI for a WindscribeVPN server.
Note: Windscribe's static OpenVPN credentials work reliably on US, AU, and NZ servers.
"""
query = query.lower()
supported_regions = {"us", "au", "nz"}
if query not in supported_regions and query not in self.server_map:
raise ValueError(
f"Windscribe proxy does not currently support the '{query.upper()}' region. "
f"Supported regions with reliable credentials: {', '.join(sorted(supported_regions))}. "
)
if query in self.server_map:
hostname = self.server_map[query]
else:
if re.match(r"^[a-z]+$", query):
hostname = self.get_random_server(query)
else:
raise ValueError(f"The query provided is unsupported and unrecognized: {query}")
if not hostname:
return None
hostname = hostname.split(':')[0]
return f"https://{self.username}:{self.password}@{hostname}:443"
def get_random_server(self, country_code: str) -> Optional[str]:
"""
Get a random server hostname for a country.
Returns None if no servers are available for the country.
"""
for location in self.countries:
if location.get("country_code", "").lower() == country_code.lower():
hostnames = []
for group in location.get("groups", []):
for host in group.get("hosts", []):
if hostname := host.get("hostname"):
hostnames.append(hostname)
if hostnames:
return random.choice(hostnames)
return None
@staticmethod
def get_countries() -> list[dict]:
"""Get a list of available Countries and their metadata."""
res = requests.get(
url="https://assets.windscribe.com/serverlist/firefox/1/1",
headers={
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36",
"Content-Type": "application/json",
},
)
if not res.ok:
raise ValueError(f"Failed to get a list of WindscribeVPN locations [{res.status_code}]")
try:
data = res.json()
return data.get("data", [])
except json.JSONDecodeError:
raise ValueError("Could not decode list of WindscribeVPN locations, not JSON data.")

View File

@ -0,0 +1,44 @@
from typing import Optional, Union
class SearchResult:
def __init__(
self,
id_: Union[str, int],
title: str,
description: Optional[str] = None,
label: Optional[str] = None,
url: Optional[str] = None,
):
"""
A Search Result for any supported Title Type.
Parameters:
id_: The search result's Title ID.
title: The primary display text, e.g., the Title's Name.
description: The secondary display text, e.g., the Title's Description or
further title information.
label: The tertiary display text. This will typically be used to display
an informative label or tag to the result. E.g., "unavailable", the
title's price tag, region, etc.
url: A hyperlink to the search result or title's page.
"""
if not isinstance(id_, (str, int)):
raise TypeError(f"Expected id_ to be a {str} or {int}, not {type(id_)}")
if not isinstance(title, str):
raise TypeError(f"Expected title to be a {str}, not {type(title)}")
if not isinstance(description, (str, type(None))):
raise TypeError(f"Expected description to be a {str}, not {type(description)}")
if not isinstance(label, (str, type(None))):
raise TypeError(f"Expected label to be a {str}, not {type(label)}")
if not isinstance(url, (str, type(None))):
raise TypeError(f"Expected url to be a {str}, not {type(url)}")
self.id = id_
self.title = title
self.description = description
self.label = label
self.url = url
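# Usage sketch (hypothetical values):
#   SearchResult(
#       id_="tt0000000",
#       title="Some Title",
#       description="A short synopsis.",
#       label="unavailable",
#       url="https://example.com/title/tt0000000",
#   )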
__all__ = ("SearchResult",)

368
unshackle/core/service.py Normal file
View File

@ -0,0 +1,368 @@
import base64
import logging
from abc import ABCMeta, abstractmethod
from collections.abc import Generator
from http.cookiejar import CookieJar
from pathlib import Path
from typing import Optional, Union
from urllib.parse import urlparse
import click
import m3u8
import requests
from requests.adapters import HTTPAdapter, Retry
from rich.padding import Padding
from rich.rule import Rule
from unshackle.core.cacher import Cacher
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import AnyTrack
from unshackle.core.credential import Credential
from unshackle.core.drm import DRM_T
from unshackle.core.search_result import SearchResult
from unshackle.core.title_cacher import TitleCacher, get_account_hash, get_region_from_proxy
from unshackle.core.titles import Title_T, Titles_T
from unshackle.core.tracks import Chapters, Tracks
from unshackle.core.utilities import get_cached_ip_info, get_ip_info
class Service(metaclass=ABCMeta):
"""The Service Base Class."""
# Abstract class variables
ALIASES: tuple[str, ...] = () # list of aliases for the service; alternatives to the service tag.
GEOFENCE: tuple[str, ...] = () # list of ip regions required to use the service. empty list == no specific region.
def __init__(self, ctx: click.Context):
console.print(Padding(Rule(f"[rule.text]Service: {self.__class__.__name__}"), (1, 2)))
self.config = ctx.obj.config
self.log = logging.getLogger(self.__class__.__name__)
self.session = self.get_session()
self.cache = Cacher(self.__class__.__name__)
self.title_cache = TitleCacher(self.__class__.__name__)
# Store context for cache control flags and credential
self.ctx = ctx
self.credential = None # Will be set in authenticate()
self.current_region = None # Will be set based on proxy/geolocation
if not ctx.parent or not ctx.parent.params.get("no_proxy"):
if ctx.parent:
proxy = ctx.parent.params["proxy"]
else:
proxy = None
if not proxy:
# don't override the explicit proxy set by the user, even if they may be geoblocked
with console.status("Checking if current region is Geoblocked...", spinner="dots"):
if self.GEOFENCE:
# Service has geofence - need fresh IP check to determine if proxy needed
try:
current_region = get_ip_info(self.session)["country"].lower()
if any(x.lower() == current_region for x in self.GEOFENCE):
self.log.info("Service is not Geoblocked in your region")
else:
requested_proxy = self.GEOFENCE[0] # first is likely main region
self.log.info(
f"Service is Geoblocked in your region, getting a Proxy to {requested_proxy}"
)
for proxy_provider in ctx.obj.proxy_providers:
proxy = proxy_provider.get_proxy(requested_proxy)
if proxy:
self.log.info(f"Got Proxy from {proxy_provider.__class__.__name__}")
break
except Exception as e:
self.log.warning(f"Failed to check geofence: {e}")
current_region = None
else:
self.log.info("Service has no Geofence")
if proxy:
self.session.proxies.update({"all": proxy})
proxy_parse = urlparse(proxy)
if proxy_parse.username and proxy_parse.password:
self.session.headers.update(
{
"Proxy-Authorization": base64.b64encode(
f"{proxy_parse.username}:{proxy_parse.password}".encode("utf8")
).decode()
}
)
# Always verify proxy IP - proxies can change exit nodes
try:
proxy_ip_info = get_ip_info(self.session)
self.current_region = proxy_ip_info.get("country", "").lower() if proxy_ip_info else None
except Exception as e:
self.log.warning(f"Failed to verify proxy IP: {e}")
# Fallback to extracting region from proxy config
self.current_region = get_region_from_proxy(proxy)
else:
# No proxy, use cached IP info for title caching (non-critical)
try:
ip_info = get_cached_ip_info(self.session)
self.current_region = ip_info.get("country", "").lower() if ip_info else None
except Exception as e:
self.log.debug(f"Failed to get cached IP info: {e}")
self.current_region = None
# Optional Abstract functions
# The following functions may be implemented by the Service.
# Otherwise, the base service code (if any) of the function will be executed on call.
# The functions will be executed in shown order.
@staticmethod
def get_session() -> requests.Session:
"""
Creates a Python-requests Session, adds common headers
from config, cookies, retry handler, and a proxy if available.
:returns: Prepared Python-requests Session
"""
session = requests.Session()
session.headers.update(config.headers)
session.mount(
"https://",
HTTPAdapter(
max_retries=Retry(total=15, backoff_factor=0.2, status_forcelist=[429, 500, 502, 503, 504]),
pool_block=True,
),
)
session.mount("http://", session.adapters["https://"])
return session
def authenticate(self, cookies: Optional[CookieJar] = None, credential: Optional[Credential] = None) -> None:
"""
Authenticate the Service with Cookies and/or Credentials (Email/Username and Password).
This is effectively a login() function. Any API calls or object initializations
needing to be made, should be made here. This will be run before any of the
following abstract functions.
You should avoid storing or using the Credential outside this function.
Make any calls you need for any Cookies, Tokens, or such, then use those.
The Cookie jar should also not be stored outside this function. However, you may load
the Cookie jar into the service session.
"""
if cookies is not None:
if not isinstance(cookies, CookieJar):
raise TypeError(f"Expected cookies to be a {CookieJar}, not {cookies!r}.")
self.session.cookies.update(cookies)
# Store credential for cache key generation
self.credential = credential
def search(self) -> Generator[SearchResult, None, None]:
"""
Search by query for titles from the Service.
The query must be taken as a CLI argument by the Service class.
Ideally just re-use the title ID argument (i.e. self.title).
Search results will be displayed in the order yielded.
"""
raise NotImplementedError(f"Search functionality has not been implemented by {self.__class__.__name__}")
def get_widevine_service_certificate(
self, *, challenge: bytes, title: Title_T, track: AnyTrack
) -> Union[bytes, str]:
"""
Get the Widevine Service Certificate used for Privacy Mode.
:param challenge: The service challenge, providing this to a License endpoint should return the
privacy certificate that the service uses.
:param title: The current `Title` from get_titles that is being executed. This is provided in
case it has data needed to be used, e.g. for an HTTP request.
:param track: The current `Track` needing decryption. Provided for same reason as `title`.
:return: The Service Privacy Certificate as Bytes or a Base64 string. Don't Base64 Encode or
Decode the data, return as is to reduce unnecessary computations.
"""
def get_widevine_license(self, *, challenge: bytes, title: Title_T, track: AnyTrack) -> Optional[Union[bytes, str]]:
"""
Get a Widevine License message by sending a License Request (challenge).
This License message contains the encrypted Content Decryption Keys and will be
read by the Cdm and decrypted.
This is a very important request to get correct. A bad, unexpected, or missing
value in the request can cause your key to be detected and promptly banned,
revoked, disabled, or downgraded.
:param challenge: The license challenge from the Widevine CDM.
:param title: The current `Title` from get_titles that is being executed. This is provided in
case it has data needed to be used, e.g. for an HTTP request.
:param track: The current `Track` needing decryption. Provided for same reason as `title`.
:return: The License response as Bytes or a Base64 string. Don't Base64 Encode or
Decode the data, return as is to reduce unnecessary computations.
"""
# Required Abstract functions
# The following functions *must* be implemented by the Service.
# The functions will be executed in shown order.
@abstractmethod
def get_titles(self) -> Titles_T:
"""
Get Titles for the provided title ID.
Return a Movies, Series, or Album objects containing Movie, Episode, or Song title objects respectively.
The returned data must be for the given title ID, or a spawn of the title ID.
At least one object is expected to be returned, or it will presume an invalid Title ID was
provided.
You can use the `data` dictionary class instance attribute of each Title to store data you may need later on.
This can be useful to store information on each title that will be required like any sub-asset IDs, or such.
"""
def get_titles_cached(self, title_id: Optional[str] = None) -> Titles_T:
"""
Cached wrapper around get_titles() to reduce redundant API calls.
This method checks the cache before calling get_titles() and handles
fallback to cached data when API calls fail.
Args:
title_id: Optional title ID for cache key generation.
If not provided, will try to extract from service instance.
Returns:
Titles object (Movies, Series, or Album)
"""
# Try to get title_id from service instance if not provided
if title_id is None:
# Different services store the title ID in different attributes
if hasattr(self, "title"):
title_id = self.title
elif hasattr(self, "title_id"):
title_id = self.title_id
else:
# If we can't determine title_id, just call get_titles directly
self.log.debug("Cannot determine title_id for caching, bypassing cache")
return self.get_titles()
# Get cache control flags from context
no_cache = False
reset_cache = False
if self.ctx and self.ctx.parent:
no_cache = self.ctx.parent.params.get("no_cache", False)
reset_cache = self.ctx.parent.params.get("reset_cache", False)
# Get account hash for cache key
account_hash = get_account_hash(self.credential)
# Use title cache to get titles with fallback support
return self.title_cache.get_cached_titles(
title_id=str(title_id),
fetch_function=self.get_titles,
region=self.current_region,
account_hash=account_hash,
no_cache=no_cache,
reset_cache=reset_cache,
)
@abstractmethod
def get_tracks(self, title: Title_T) -> Tracks:
"""
Get Track objects of the Title.
Return a Tracks object, which itself can contain Video, Audio, Subtitle or even Chapters.
Tracks.videos, Tracks.audio, Tracks.subtitles, and Track.chapters should be a List of Track objects.
Each Track in the Tracks should represent a Video/Audio Stream/Representation/Adaptation or
a Subtitle file.
While one Track should only hold information for one stream/downloadable, try to get as many
unique Track objects per stream type so Stream selection by the root code can give you more
options in terms of Resolution, Bitrate, Codecs, Language, etc.
No decision making or filtering of which Tracks get returned should happen here. It can be
considered an error to filter for e.g. resolution, codec, and such. All filtering based on
arguments will be done by the root code automatically when needed.
Make sure you correctly mark which Tracks are encrypted or not, and by which DRM System
via its `drm` property.
If you are able to obtain the Track's KID (Key ID) as a 32 char (16 byte) HEX string, provide
it to the Track's `kid` variable as it will speed up the decryption process later on. It may
or may not be needed, that depends on the service. Generally if you can provide it, without
downloading any of the Track's stream data, then do.
:param title: The current `Title` from get_titles that is being executed.
:return: Tracks object containing Video, Audio, Subtitles, and Chapters, if available.
"""
@abstractmethod
def get_chapters(self, title: Title_T) -> Chapters:
"""
Get Chapters for the Title.
Parameters:
title: The current Title from `get_titles` that is being processed.
You must return a Chapters object containing 0 or more Chapter objects.
You do not need to set a Chapter number or sort/order the chapters in any way as
the Chapters class automatically handles all of that for you. If there's no
descriptive name for a Chapter then do not set a name at all.
You must not set Chapter names to "Chapter {n}" or such. If you (or the user)
wants "Chapter {n}" style Chapter names (or similar) then they can use the config
option `chapter_fallback_name`. For example, `"Chapter {i:02}"` for "Chapter 01".
"""
# Optional Event methods
def on_segment_downloaded(self, track: AnyTrack, segment: Path) -> None:
"""
Called when one of a Track's Segments has finished downloading.
Parameters:
track: The Track object that had a Segment downloaded.
segment: The Path to the Segment that was downloaded.
"""
def on_track_downloaded(self, track: AnyTrack) -> None:
"""
Called when a Track has finished downloading.
Parameters:
track: The Track object that was downloaded.
"""
def on_track_decrypted(self, track: AnyTrack, drm: DRM_T, segment: Optional[m3u8.Segment] = None) -> None:
"""
Called when a Track has finished decrypting.
Parameters:
track: The Track object that was decrypted.
drm: The DRM object it decrypted with.
segment: The HLS segment information that was decrypted.
"""
def on_track_repacked(self, track: AnyTrack) -> None:
"""
Called when a Track has finished repacking.
Parameters:
track: The Track object that was repacked.
"""
def on_track_multiplex(self, track: AnyTrack) -> None:
"""
Called when a Track is about to be Multiplexed into a Container.
Note: Right now only MKV containers are multiplexed but in the future
this may also be called when multiplexing to other containers like
MP4 via ffmpeg/mp4box.
Parameters:
track: The Track object that is about to be multiplexed.
"""
__all__ = ("Service",)

View File

@ -0,0 +1,90 @@
from pathlib import Path
import click
from unshackle.core.config import config
from unshackle.core.service import Service
from unshackle.core.utilities import import_module_by_path
_service_dirs = config.directories.services
if not isinstance(_service_dirs, list):
_service_dirs = [_service_dirs]
_SERVICES = sorted(
(path for service_dir in _service_dirs for path in service_dir.glob("*/__init__.py")),
key=lambda x: x.parent.stem,
)
_MODULES = {path.parent.stem: getattr(import_module_by_path(path), path.parent.stem) for path in _SERVICES}
_ALIASES = {tag: getattr(module, "ALIASES") for tag, module in _MODULES.items()}
class Services(click.MultiCommand):
"""Lazy-loaded command group of project services."""
# Click-specific methods
def list_commands(self, ctx: click.Context) -> list[str]:
"""Returns a list of all available Services as command names for Click."""
return Services.get_tags()
def get_command(self, ctx: click.Context, name: str) -> click.Command:
"""Load the Service and return the Click CLI method."""
tag = Services.get_tag(name)
try:
service = Services.load(tag)
except KeyError as e:
available_services = self.list_commands(ctx)
if not available_services:
raise click.ClickException(
f"There are no Services added yet, therefore the '{name}' Service could not be found."
)
raise click.ClickException(f"{e}. Available Services: {', '.join(available_services)}")
if hasattr(service, "cli"):
return service.cli
raise click.ClickException(f"Service '{tag}' has no 'cli' method configured.")
# Methods intended to be used anywhere
@staticmethod
def get_tags() -> list[str]:
"""Returns a list of service tags from all available Services."""
return [x.parent.stem for x in _SERVICES]
@staticmethod
def get_path(name: str) -> Path:
"""Get the directory path of a command."""
tag = Services.get_tag(name)
for service in _SERVICES:
if service.parent.stem == tag:
return service.parent
raise KeyError(f"There is no Service added by the Tag '{name}'")
@staticmethod
def get_tag(value: str) -> str:
"""
Get the Service Tag (e.g. DSNP, not DisneyPlus/Disney+, etc.) by an Alias.
Input value can be of any case-sensitivity.
Original input value is returned if it did not match a service tag.
"""
original_value = value
value = value.lower()
for path in _SERVICES:
tag = path.parent.stem
if value in (tag.lower(), *_ALIASES.get(tag, [])):
return tag
return original_value
@staticmethod
def load(tag: str) -> Service:
"""Load a Service module by Service tag."""
module = _MODULES.get(tag)
if not module:
raise KeyError(f"There is no Service added by the Tag '{tag}'")
return module
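# Usage sketch (hypothetical service tag "EXAMPLE" living at
# <services dir>/EXAMPLE/__init__.py):
#   tag = Services.get_tag("example")   # case-insensitive tag/alias lookup
#   path = Services.get_path(tag)       # directory containing the service
#   svc = Services.load(tag)            # the service class itself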
__all__ = ("Services",)

250
unshackle/core/session.py Normal file
View File

@ -0,0 +1,250 @@
"""Session utilities for creating HTTP sessions with different backends."""
from __future__ import annotations
import logging
import random
import time
import warnings
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Any
from urllib.parse import urlparse
from curl_cffi.requests import Response, Session, exceptions
from unshackle.core.config import config
# Globally suppress curl_cffi HTTPS proxy warnings since some proxy providers
# (like NordVPN) require HTTPS URLs but curl_cffi expects HTTP format
warnings.filterwarnings(
"ignore", message="Make sure you are using https over https proxy.*", category=RuntimeWarning, module="curl_cffi.*"
)
FINGERPRINT_PRESETS = {
"okhttp4": {
"ja3": (
"771," # TLS 1.2
"4865-4866-4867-49195-49196-52393-49199-49200-52392-49171-49172-156-157-47-53," # Ciphers
"0-23-65281-10-11-35-16-5-13-51-45-43," # Extensions
"29-23-24," # Named groups (x25519, secp256r1, secp384r1)
"0" # EC point formats
),
"akamai": "4:16777216|16711681|0|m,p,a,s",
"description": "OkHttp 3.x/4.x (BoringSSL TLS stack)",
},
"okhttp5": {
"ja3": (
"771," # TLS 1.2
"4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53," # Ciphers
"0-23-65281-10-11-35-16-5-13-51-45-43," # Extensions
"29-23-24," # Named groups (x25519, secp256r1, secp384r1)
"0" # EC point formats
),
"akamai": "4:16777216|16711681|0|m,p,a,s",
"description": "OkHttp 5.x (BoringSSL TLS stack)",
},
}
class MaxRetriesError(exceptions.RequestException):
def __init__(self, message, cause=None):
super().__init__(message)
self.__cause__ = cause
class CurlSession(Session):
def __init__(
self,
max_retries: int = 10,
backoff_factor: float = 0.2,
max_backoff: float = 60.0,
status_forcelist: list[int] | None = None,
allowed_methods: set[str] | None = None,
catch_exceptions: tuple[type[Exception], ...] | None = None,
**session_kwargs: Any,
):
super().__init__(**session_kwargs)
self.max_retries = max_retries
self.backoff_factor = backoff_factor
self.max_backoff = max_backoff
self.status_forcelist = status_forcelist or [429, 500, 502, 503, 504]
self.allowed_methods = allowed_methods or {"GET", "POST", "HEAD", "OPTIONS", "PUT", "DELETE", "TRACE"}
self.catch_exceptions = catch_exceptions or (
exceptions.ConnectionError,
exceptions.ProxyError,
exceptions.SSLError,
exceptions.Timeout,
)
self.log = logging.getLogger(self.__class__.__name__)
def get_sleep_time(self, response: Response | None, attempt: int) -> float | None:
if response:
retry_after = response.headers.get("Retry-After")
if retry_after:
try:
return float(retry_after)
except ValueError:
if retry_date := parsedate_to_datetime(retry_after):
return (retry_date - datetime.now(timezone.utc)).total_seconds()
if attempt == 0:
return 0.0
backoff_value = self.backoff_factor * (2 ** (attempt - 1))
jitter = backoff_value * 0.1
sleep_time = backoff_value + random.uniform(-jitter, jitter)
return min(sleep_time, self.max_backoff)
def request(self, method: str, url: str, **kwargs: Any) -> Response:
if method.upper() not in self.allowed_methods:
return super().request(method, url, **kwargs)
last_exception = None
response = None
for attempt in range(self.max_retries + 1):
try:
response = super().request(method, url, **kwargs)
if response.status_code not in self.status_forcelist:
return response
last_exception = exceptions.HTTPError(f"Received status code: {response.status_code}")
self.log.warning(
f"{response.status_code} {response.reason}({urlparse(url).path}). Retrying... "
f"({attempt + 1}/{self.max_retries})"
)
except self.catch_exceptions as e:
last_exception = e
response = None
self.log.warning(
f"{e.__class__.__name__}({urlparse(url).path}). Retrying... ({attempt + 1}/{self.max_retries})"
)
if attempt < self.max_retries:
if sleep_duration := self.get_sleep_time(response, attempt + 1):
if sleep_duration > 0:
time.sleep(sleep_duration)
else:
break
raise MaxRetriesError(f"Max retries exceeded for {method} {url}", cause=last_exception)
def session(
browser: str | None = None,
ja3: str | None = None,
akamai: str | None = None,
extra_fp: dict | None = None,
**kwargs,
) -> CurlSession:
"""
Create a curl_cffi session that impersonates a browser or custom TLS/HTTP fingerprint.
This is a full replacement for requests.Session with browser impersonation
and anti-bot capabilities. The session uses curl-impersonate under the hood
to mimic real browser behavior.
Args:
browser: Browser to impersonate (e.g. "chrome124", "firefox", "safari") OR
fingerprint preset name (e.g. "okhttp4").
Uses the configured default from curl_impersonate.browser if not specified.
            Available presets: okhttp4, okhttp5
See https://github.com/lexiforest/curl_cffi#sessions for browser options.
ja3: Custom JA3 TLS fingerprint string (format: "SSLVersion,Ciphers,Extensions,Curves,PointFormats").
When provided, curl_cffi will use this exact TLS fingerprint instead of the browser's default.
See https://curl-cffi.readthedocs.io/en/latest/impersonate/customize.html
akamai: Custom Akamai HTTP/2 fingerprint string (format: "SETTINGS|WINDOW_UPDATE|PRIORITY|PSEUDO_HEADERS").
When provided, curl_cffi will use this exact HTTP/2 fingerprint instead of the browser's default.
See https://curl-cffi.readthedocs.io/en/latest/impersonate/customize.html
extra_fp: Additional fingerprint parameters dict for advanced customization.
See https://curl-cffi.readthedocs.io/en/latest/impersonate/customize.html
**kwargs: Additional arguments passed to CurlSession constructor:
- headers: Additional headers (dict)
- cookies: Cookie jar or dict
- auth: HTTP basic auth tuple (username, password)
- proxies: Proxy configuration dict
- verify: SSL certificate verification (bool, default True)
- timeout: Request timeout in seconds (float or tuple)
- allow_redirects: Follow redirects (bool, default True)
- max_redirects: Maximum redirect count (int)
- cert: Client certificate (str or tuple)
Extra arguments for retry handler:
- max_retries: Maximum number of retries (int, default 10)
- backoff_factor: Backoff factor (float, default 0.2)
- max_backoff: Maximum backoff time (float, default 60.0)
- status_forcelist: List of status codes to force retry (list, default [429, 500, 502, 503, 504])
            - allowed_methods: Set of HTTP methods eligible for retries (default {"GET", "POST", "HEAD", "OPTIONS", "PUT", "DELETE", "TRACE"})
            - catch_exceptions: Tuple of exception types to catch and retry on (default (exceptions.ConnectionError, exceptions.ProxyError, exceptions.SSLError, exceptions.Timeout))
Returns:
curl_cffi.requests.Session configured with browser impersonation or custom fingerprints,
common headers, and equivalent retry behavior to requests.Session.
Examples:
# Standard browser impersonation
from unshackle.core.session import session
class MyService(Service):
@staticmethod
def get_session():
return session() # Uses config default browser
# Use OkHttp 4.x preset for Android TV
class AndroidService(Service):
@staticmethod
def get_session():
return session("okhttp4")
# Custom fingerprint (manual)
class CustomService(Service):
@staticmethod
def get_session():
return session(
ja3="771,4865-4866-4867-49195...",
akamai="1:65536;2:0;4:6291456;6:262144|15663105|0|m,a,s,p",
)
# With retry configuration
class MyService(Service):
@staticmethod
def get_session():
return session(
"okhttp4",
max_retries=5,
status_forcelist=[429, 500],
allowed_methods={"GET", "HEAD", "OPTIONS"},
)
"""
if browser and browser in FINGERPRINT_PRESETS:
preset = FINGERPRINT_PRESETS[browser]
if ja3 is None:
ja3 = preset.get("ja3")
if akamai is None:
akamai = preset.get("akamai")
if extra_fp is None:
extra_fp = preset.get("extra_fp")
browser = None
if browser is None and ja3 is None and akamai is None:
browser = config.curl_impersonate.get("browser", "chrome")
session_config = {}
if browser:
session_config["impersonate"] = browser
if ja3:
session_config["ja3"] = ja3
if akamai:
session_config["akamai"] = akamai
if extra_fp:
session_config["extra_fp"] = extra_fp
session_config.update(kwargs)
session_obj = CurlSession(**session_config)
session_obj.headers.update(config.headers)
return session_obj
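As a sanity check on the retry schedule, this standalone sketch mirrors get_sleep_time for the no-Retry-After case under the defaults above (factor 0.2, ±10% jitter, 60s cap); the numbers are illustrative:

import random

def backoff(attempt: int, factor: float = 0.2, cap: float = 60.0) -> float:
    # attempt 0 retries immediately; later attempts grow exponentially
    # with +/-10% jitter, clamped to the cap.
    if attempt == 0:
        return 0.0
    base = factor * (2 ** (attempt - 1))
    jitter = base * 0.1
    return min(base + random.uniform(-jitter, jitter), cap)

# attempts 1..5 sleep roughly 0.2, 0.4, 0.8, 1.6, 3.2 seconds before jitter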

View File

@ -0,0 +1,401 @@
from __future__ import annotations
import hashlib
import logging
from datetime import datetime, timedelta
from typing import Optional
from unshackle.core.cacher import Cacher
from unshackle.core.config import config
from unshackle.core.titles import Titles_T
class TitleCacher:
"""
Handles caching of Title objects to reduce redundant API calls.
This wrapper provides:
- Region-aware caching to handle geo-restricted content
- Automatic fallback to cached data when API calls fail
- Cache lifetime extension during failures
- Cache hit/miss statistics for debugging
"""
def __init__(self, service_name: str):
self.service_name = service_name
self.log = logging.getLogger(f"{service_name}.TitleCache")
self.cacher = Cacher(service_name)
self.stats = {"hits": 0, "misses": 0, "fallbacks": 0}
def _generate_cache_key(
self, title_id: str, region: Optional[str] = None, account_hash: Optional[str] = None
) -> str:
"""
Generate a unique cache key for title data.
Args:
title_id: The title identifier
region: The region/proxy identifier
account_hash: Hash of account credentials (if applicable)
Returns:
A unique cache key string
"""
# Hash the title_id to handle complex IDs (URLs, dots, special chars)
# This ensures consistent length and filesystem-safe keys
title_hash = hashlib.sha256(title_id.encode()).hexdigest()[:16]
# Start with base key using hash
key_parts = ["titles", title_hash]
# Add region if available
if region:
key_parts.append(region.lower())
# Add account hash if available
if account_hash:
key_parts.append(account_hash[:8]) # Use first 8 chars of hash
# Join with underscores
cache_key = "_".join(key_parts)
# Log the mapping for debugging
self.log.debug(f"Cache key mapping: {title_id} -> {cache_key}")
return cache_key
def get_cached_titles(
self,
title_id: str,
fetch_function,
region: Optional[str] = None,
account_hash: Optional[str] = None,
no_cache: bool = False,
reset_cache: bool = False,
) -> Optional[Titles_T]:
"""
Get titles from cache or fetch from API with fallback support.
Args:
title_id: The title identifier
fetch_function: Function to call to fetch fresh titles
region: The region/proxy identifier
account_hash: Hash of account credentials
no_cache: Bypass cache completely
reset_cache: Clear cache before fetching
Returns:
Titles object (Movies, Series, or Album)
"""
# If caching is globally disabled or no_cache flag is set
if not config.title_cache_enabled or no_cache:
self.log.debug("Cache bypassed, fetching fresh titles")
return fetch_function()
# Generate cache key
cache_key = self._generate_cache_key(title_id, region, account_hash)
# If reset_cache flag is set, clear the cache entry
if reset_cache:
self.log.info(f"Clearing cache for {cache_key}")
cache_path = (config.directories.cache / self.service_name / cache_key).with_suffix(".json")
if cache_path.exists():
cache_path.unlink()
# Try to get from cache
cache = self.cacher.get(cache_key, version=1)
# Check if we have valid cached data
if cache and not cache.expired:
self.stats["hits"] += 1
self.log.debug(f"Cache hit for {title_id} (hits: {self.stats['hits']}, misses: {self.stats['misses']})")
return cache.data
# Cache miss or expired, try to fetch fresh data
self.stats["misses"] += 1
self.log.debug(f"Cache miss for {title_id}, fetching fresh data")
try:
# Attempt to fetch fresh titles
titles = fetch_function()
if titles:
# Successfully fetched, update cache
self.log.debug(f"Successfully fetched titles for {title_id}, updating cache")
cache = self.cacher.get(cache_key, version=1)
cache.set(titles, expiration=datetime.now() + timedelta(seconds=config.title_cache_time))
return titles
except Exception as e:
# API call failed, check if we have fallback cached data
if cache and cache.data:
# We have expired cached data, use it as fallback
current_time = datetime.now()
max_retention_time = cache.expiration + timedelta(
seconds=config.title_cache_max_retention - config.title_cache_time
)
if current_time < max_retention_time:
self.stats["fallbacks"] += 1
self.log.warning(
f"API call failed for {title_id}, using cached data as fallback "
f"(fallbacks: {self.stats['fallbacks']})"
)
self.log.debug(f"Error was: {e}")
# Extend cache lifetime
extended_expiration = current_time + timedelta(minutes=5)
if extended_expiration < max_retention_time:
cache.expiration = extended_expiration
cache.set(cache.data, expiration=extended_expiration)
return cache.data
else:
self.log.error(f"API call failed and cached data for {title_id} exceeded maximum retention time")
# Re-raise the exception if no fallback available
raise
def clear_all_title_cache(self):
"""Clear all title caches for this service."""
cache_dir = config.directories.cache / self.service_name
if cache_dir.exists():
for cache_file in cache_dir.glob("titles_*.json"):
cache_file.unlink()
self.log.info(f"Cleared cache file: {cache_file.name}")
def get_cache_stats(self) -> dict:
"""Get cache statistics."""
total = sum(self.stats.values())
if total > 0:
hit_rate = (self.stats["hits"] / total) * 100
else:
hit_rate = 0
return {
"hits": self.stats["hits"],
"misses": self.stats["misses"],
"fallbacks": self.stats["fallbacks"],
"hit_rate": f"{hit_rate:.1f}%",
}
def get_cached_tmdb(
self, title_id: str, kind: str, region: Optional[str] = None, account_hash: Optional[str] = None
) -> Optional[dict]:
"""
Get cached TMDB data for a title.
Args:
title_id: The title identifier
kind: "movie" or "tv"
region: The region/proxy identifier
account_hash: Hash of account credentials
Returns:
Dict with 'detail' and 'external_ids' if cached and valid, None otherwise
"""
if not config.title_cache_enabled:
return None
cache_key = self._generate_cache_key(title_id, region, account_hash)
cache = self.cacher.get(cache_key, version=1)
if not cache or not cache.data:
return None
tmdb_data = getattr(cache.data, "tmdb_data", None)
if not tmdb_data:
return None
tmdb_expiration = tmdb_data.get("expires_at")
if not tmdb_expiration or datetime.now() >= tmdb_expiration:
self.log.debug(f"TMDB cache expired for {title_id}")
return None
if tmdb_data.get("kind") != kind:
self.log.debug(f"TMDB cache kind mismatch for {title_id}: cached {tmdb_data.get('kind')}, requested {kind}")
return None
self.log.debug(f"TMDB cache hit for {title_id}")
return {
"detail": tmdb_data.get("detail"),
"external_ids": tmdb_data.get("external_ids"),
"fetched_at": tmdb_data.get("fetched_at"),
}
def cache_tmdb(
self,
title_id: str,
detail_response: dict,
external_ids_response: dict,
kind: str,
region: Optional[str] = None,
account_hash: Optional[str] = None,
) -> None:
"""
Cache TMDB data for a title.
Args:
title_id: The title identifier
detail_response: Full TMDB detail API response
external_ids_response: Full TMDB external_ids API response
kind: "movie" or "tv"
region: The region/proxy identifier
account_hash: Hash of account credentials
"""
if not config.title_cache_enabled:
return
cache_key = self._generate_cache_key(title_id, region, account_hash)
cache = self.cacher.get(cache_key, version=1)
if not cache or not cache.data:
self.log.debug(f"Cannot cache TMDB data: no title cache exists for {title_id}")
return
now = datetime.now()
tmdb_data = {
"detail": detail_response,
"external_ids": external_ids_response,
"kind": kind,
"fetched_at": now,
"expires_at": now + timedelta(days=7), # 7-day expiration
}
cache.data.tmdb_data = tmdb_data
cache.set(cache.data, expiration=cache.expiration)
self.log.debug(f"Cached TMDB data for {title_id} (kind={kind})")
def get_cached_simkl(
self, title_id: str, region: Optional[str] = None, account_hash: Optional[str] = None
) -> Optional[dict]:
"""
Get cached Simkl data for a title.
Args:
title_id: The title identifier
region: The region/proxy identifier
account_hash: Hash of account credentials
Returns:
Simkl response dict if cached and valid, None otherwise
"""
if not config.title_cache_enabled:
return None
cache_key = self._generate_cache_key(title_id, region, account_hash)
cache = self.cacher.get(cache_key, version=1)
if not cache or not cache.data:
return None
simkl_data = getattr(cache.data, "simkl_data", None)
if not simkl_data:
return None
simkl_expiration = simkl_data.get("expires_at")
if not simkl_expiration or datetime.now() >= simkl_expiration:
self.log.debug(f"Simkl cache expired for {title_id}")
return None
self.log.debug(f"Simkl cache hit for {title_id}")
return simkl_data.get("response")
def cache_simkl(
self,
title_id: str,
simkl_response: dict,
region: Optional[str] = None,
account_hash: Optional[str] = None,
) -> None:
"""
Cache Simkl data for a title.
Args:
title_id: The title identifier
simkl_response: Full Simkl API response
region: The region/proxy identifier
account_hash: Hash of account credentials
"""
if not config.title_cache_enabled:
return
cache_key = self._generate_cache_key(title_id, region, account_hash)
cache = self.cacher.get(cache_key, version=1)
if not cache or not cache.data:
self.log.debug(f"Cannot cache Simkl data: no title cache exists for {title_id}")
return
now = datetime.now()
simkl_data = {
"response": simkl_response,
"fetched_at": now,
"expires_at": now + timedelta(days=7),
}
cache.data.simkl_data = simkl_data
cache.set(cache.data, expiration=cache.expiration)
self.log.debug(f"Cached Simkl data for {title_id}")
def get_region_from_proxy(proxy_url: Optional[str]) -> Optional[str]:
"""
Extract region identifier from proxy URL.
Args:
proxy_url: The proxy URL string
Returns:
Region identifier or None
"""
if not proxy_url:
return None
# Try to extract region from common proxy patterns
# e.g., "us123.nordvpn.com", "gb-proxy.example.com"
import re
# Pattern for NordVPN style
nord_match = re.search(r"([a-z]{2})\d+\.nordvpn", proxy_url.lower())
if nord_match:
return nord_match.group(1)
# Pattern for country code at start
cc_match = re.search(r"([a-z]{2})[-_]", proxy_url.lower())
if cc_match:
return cc_match.group(1)
# Pattern for country code subdomain
subdomain_match = re.search(r"://([a-z]{2})\.", proxy_url.lower())
if subdomain_match:
return subdomain_match.group(1)
return None
def get_account_hash(credential) -> Optional[str]:
"""
Generate a hash for account identification.
Args:
credential: Credential object
Returns:
SHA1 hash of the credential or None
"""
if not credential:
return None
# Use existing sha1 property if available
if hasattr(credential, "sha1"):
return credential.sha1
# Otherwise generate hash from username
if hasattr(credential, "username"):
return hashlib.sha1(credential.username.encode()).hexdigest()
return None
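A hedged usage sketch tying the pieces above together; MyService and api_fetch_titles are hypothetical, but the call shapes match the helpers in this module:

cacher = TitleCacher("MyService")  # hypothetical service name
region = get_region_from_proxy("http://us1234.nordvpn.com:80")  # -> "us"
titles = cacher.get_cached_titles(
    title_id="series/123",
    fetch_function=lambda: api_fetch_titles("series/123"),  # hypothetical fetcher
    region=region,
)
print(cacher.get_cache_stats())  # e.g. {'hits': 0, 'misses': 1, 'fallbacks': 0, 'hit_rate': '0.0%'}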

11
unshackle/core/titles/__init__.py Normal file
View File

@ -0,0 +1,11 @@
from typing import Union
from .episode import Episode, Series
from .movie import Movie, Movies
from .song import Album, Song
Title_T = Union[Movie, Episode, Song]
Titles_T = Union[Movies, Series, Album]
__all__ = ("Episode", "Series", "Movie", "Movies", "Album", "Song", "Title_T", "Titles_T")

246
unshackle/core/titles/episode.py Normal file
View File

@ -0,0 +1,246 @@
import re
from abc import ABC
from collections import Counter
from typing import Any, Iterable, Optional, Union
from langcodes import Language
from pymediainfo import MediaInfo
from rich.tree import Tree
from sortedcontainers import SortedKeyList
from unshackle.core.config import config
from unshackle.core.constants import AUDIO_CODEC_MAP, DYNAMIC_RANGE_MAP, VIDEO_CODEC_MAP
from unshackle.core.titles.title import Title
from unshackle.core.utilities import sanitize_filename
class Episode(Title):
def __init__(
self,
id_: Any,
service: type,
title: str,
season: Union[int, str],
number: Union[int, str],
name: Optional[str] = None,
year: Optional[Union[int, str]] = None,
language: Optional[Union[str, Language]] = None,
data: Optional[Any] = None,
description: Optional[str] = None,
) -> None:
super().__init__(id_, service, language, data)
if not title:
raise ValueError("Episode title must be provided")
if not isinstance(title, str):
raise TypeError(f"Expected title to be a str, not {title!r}")
if season != 0 and not season:
raise ValueError("Episode season must be provided")
if isinstance(season, str) and season.isdigit():
season = int(season)
elif not isinstance(season, int):
raise TypeError(f"Expected season to be an int, not {season!r}")
if number != 0 and not number:
raise ValueError("Episode number must be provided")
if isinstance(number, str) and number.isdigit():
number = int(number)
elif not isinstance(number, int):
raise TypeError(f"Expected number to be an int, not {number!r}")
if name is not None and not isinstance(name, str):
raise TypeError(f"Expected name to be a str, not {name!r}")
if year is not None:
if isinstance(year, str) and year.isdigit():
year = int(year)
elif not isinstance(year, int):
raise TypeError(f"Expected year to be an int, not {year!r}")
title = title.strip()
if name is not None:
name = name.strip()
# ignore episode names that are the episode number or title name
if re.match(r"Episode ?#?\d+", name, re.IGNORECASE):
name = None
elif name.lower() == title.lower():
name = None
if year is not None and year <= 0:
raise ValueError(f"Episode year cannot be {year}")
self.title = title
self.season = season
self.number = number
self.name = name
self.year = year
self.description = description
def __str__(self) -> str:
return "{title}{year} S{season:02}E{number:02} {name}".format(
title=self.title,
year=f" {self.year}" if self.year and config.series_year else "",
season=self.season,
number=self.number,
name=self.name or "",
).strip()
def get_filename(self, media_info: MediaInfo, folder: bool = False, show_service: bool = True) -> str:
primary_video_track = next(iter(media_info.video_tracks), None)
primary_audio_track = None
if media_info.audio_tracks:
sorted_audio = sorted(
media_info.audio_tracks,
key=lambda x: (
float(x.bit_rate) if x.bit_rate else 0,
bool(x.format_additionalfeatures and "JOC" in x.format_additionalfeatures),
),
reverse=True,
)
primary_audio_track = sorted_audio[0]
unique_audio_languages = len({x.language.split("-")[0] for x in media_info.audio_tracks if x.language})
# Title [Year] SXXEXX Name (or Title [Year] SXX if folder)
if folder:
name = f"{self.title}"
if self.year and config.series_year:
name += f" {self.year}"
name += f" S{self.season:02}"
else:
name = "{title}{year} S{season:02}E{number:02} {name}".format(
title=self.title.replace("$", "S"), # e.g., Arli$$
year=f" {self.year}" if self.year and config.series_year else "",
season=self.season,
number=self.number,
name=self.name or "",
).strip()
if config.scene_naming:
# Resolution
if primary_video_track:
resolution = primary_video_track.height
aspect_ratio = [
int(float(plane)) for plane in primary_video_track.other_display_aspect_ratio[0].split(":")
]
if len(aspect_ratio) == 1:
                    # e.g., an aspect ratio of 2 (2.00:1) parses to [2]; append a 1 to make it [2, 1]
aspect_ratio.append(1)
if aspect_ratio[0] / aspect_ratio[1] not in (16 / 9, 4 / 3):
# We want the resolution represented in a 4:3 or 16:9 canvas.
# If it's not 4:3 or 16:9, calculate as if it's inside a 16:9 canvas,
# otherwise the track's height value is fine.
# We are assuming this title is some weird aspect ratio so most
# likely a movie or HD source, so it's most likely widescreen so
# 16:9 canvas makes the most sense.
resolution = int(primary_video_track.width * (9 / 16))
name += f" {resolution}p"
# Service
if show_service:
name += f" {self.service.__name__}"
# 'WEB-DL'
name += " WEB-DL"
# DUAL
if unique_audio_languages == 2:
name += " DUAL"
# MULTi
if unique_audio_languages > 2:
name += " MULTi"
# Audio Codec + Channels (+ feature)
if primary_audio_track:
codec = primary_audio_track.format
channel_layout = primary_audio_track.channel_layout or primary_audio_track.channellayout_original
if channel_layout:
channels = float(
sum({"LFE": 0.1}.get(position.upper(), 1) for position in channel_layout.split(" "))
)
else:
channel_count = primary_audio_track.channel_s or primary_audio_track.channels or 0
channels = float(channel_count)
features = primary_audio_track.format_additionalfeatures or ""
name += f" {AUDIO_CODEC_MAP.get(codec, codec)}{channels:.1f}"
if "JOC" in features or primary_audio_track.joc:
name += " Atmos"
# Video (dynamic range + hfr +) Codec
if primary_video_track:
codec = primary_video_track.format
hdr_format = primary_video_track.hdr_format_commercial
hdr_format_full = primary_video_track.hdr_format or ""
trc = (
primary_video_track.transfer_characteristics
or primary_video_track.transfer_characteristics_original
or ""
)
frame_rate = float(primary_video_track.frame_rate)
# Primary HDR format detection
if hdr_format:
if hdr_format_full.startswith("Dolby Vision"):
name += " DV"
if any(indicator in hdr_format_full for indicator in ["HDR10", "SMPTE ST 2086"]):
name += " HDR"
else:
name += f" {DYNAMIC_RANGE_MAP.get(hdr_format)} "
elif "HLG" in trc or "Hybrid Log-Gamma" in trc or "ARIB STD-B67" in trc or "arib-std-b67" in trc.lower():
name += " HLG"
elif any(indicator in trc for indicator in ["PQ", "SMPTE ST 2084", "BT.2100"]) or "smpte2084" in trc.lower() or "bt.2020-10" in trc.lower():
name += " HDR"
if frame_rate > 30:
name += " HFR"
name += f" {VIDEO_CODEC_MAP.get(codec, codec)}"
if config.tag:
name += f"-{config.tag}"
return sanitize_filename(name)
else:
# Simple naming style without technical details - use spaces instead of dots
return sanitize_filename(name, " ")
class Series(SortedKeyList, ABC):
def __init__(self, iterable: Optional[Iterable] = None):
super().__init__(iterable, key=lambda x: (x.season, x.number, x.year or 0))
def __str__(self) -> str:
if not self:
return super().__str__()
return self[0].title + (f" ({self[0].year})" if self[0].year and config.series_year else "")
def tree(self, verbose: bool = False) -> Tree:
seasons = Counter(x.season for x in self)
num_seasons = len(seasons)
season_breakdown = ", ".join(f"S{season}({count})" for season, count in sorted(seasons.items()))
tree = Tree(
f"{num_seasons} seasons, {season_breakdown}",
guide_style="bright_black",
)
if verbose:
for season, episodes in seasons.items():
season_tree = tree.add(
f"[bold]Season {str(season).zfill(len(str(num_seasons)))}[/]: [bright_black]{episodes} episodes",
guide_style="bright_black",
)
for episode in self:
if episode.season == season:
if episode.name:
season_tree.add(
f"[bold]{str(episode.number).zfill(len(str(episodes)))}.[/] "
f"[bright_black]{episode.name}"
)
else:
season_tree.add(f"[bright_black]Episode {str(episode.number).zfill(len(str(episodes)))}")
return tree
__all__ = ("Episode", "Series")

180
unshackle/core/titles/movie.py Normal file
View File

@ -0,0 +1,180 @@
from abc import ABC
from typing import Any, Iterable, Optional, Union
from langcodes import Language
from pymediainfo import MediaInfo
from rich.tree import Tree
from sortedcontainers import SortedKeyList
from unshackle.core.config import config
from unshackle.core.constants import AUDIO_CODEC_MAP, DYNAMIC_RANGE_MAP, VIDEO_CODEC_MAP
from unshackle.core.titles.title import Title
from unshackle.core.utilities import sanitize_filename
class Movie(Title):
def __init__(
self,
id_: Any,
service: type,
name: str,
year: Optional[Union[int, str]] = None,
language: Optional[Union[str, Language]] = None,
data: Optional[Any] = None,
description: Optional[str] = None,
) -> None:
super().__init__(id_, service, language, data)
if not name:
raise ValueError("Movie name must be provided")
if not isinstance(name, str):
raise TypeError(f"Expected name to be a str, not {name!r}")
if year is not None:
if isinstance(year, str) and year.isdigit():
year = int(year)
elif not isinstance(year, int):
raise TypeError(f"Expected year to be an int, not {year!r}")
name = name.strip()
if year is not None and year <= 0:
raise ValueError(f"Movie year cannot be {year}")
self.name = name
self.year = year
self.description = description
def __str__(self) -> str:
if self.year:
return f"{self.name} ({self.year})"
return self.name
def get_filename(self, media_info: MediaInfo, folder: bool = False, show_service: bool = True) -> str:
primary_video_track = next(iter(media_info.video_tracks), None)
primary_audio_track = None
if media_info.audio_tracks:
sorted_audio = sorted(
media_info.audio_tracks,
key=lambda x: (
float(x.bit_rate) if x.bit_rate else 0,
bool(x.format_additionalfeatures and "JOC" in x.format_additionalfeatures),
),
reverse=True,
)
primary_audio_track = sorted_audio[0]
unique_audio_languages = len({x.language.split("-")[0] for x in media_info.audio_tracks if x.language})
# Name (Year)
name = str(self).replace("$", "S") # e.g., Arli$$
if config.scene_naming:
# Resolution
if primary_video_track:
resolution = primary_video_track.height
aspect_ratio = [
int(float(plane)) for plane in primary_video_track.other_display_aspect_ratio[0].split(":")
]
if len(aspect_ratio) == 1:
                    # e.g., an aspect ratio of 2 (2.00:1) parses to [2]; append a 1 to make it [2, 1]
aspect_ratio.append(1)
if aspect_ratio[0] / aspect_ratio[1] not in (16 / 9, 4 / 3):
# We want the resolution represented in a 4:3 or 16:9 canvas.
# If it's not 4:3 or 16:9, calculate as if it's inside a 16:9 canvas,
# otherwise the track's height value is fine.
# We are assuming this title is some weird aspect ratio so most
# likely a movie or HD source, so it's most likely widescreen so
# 16:9 canvas makes the most sense.
resolution = int(primary_video_track.width * (9 / 16))
name += f" {resolution}p"
# Service
if show_service:
name += f" {self.service.__name__}"
# 'WEB-DL'
name += " WEB-DL"
# DUAL
if unique_audio_languages == 2:
name += " DUAL"
# MULTi
if unique_audio_languages > 2:
name += " MULTi"
# Audio Codec + Channels (+ feature)
if primary_audio_track:
codec = primary_audio_track.format
channel_layout = primary_audio_track.channel_layout or primary_audio_track.channellayout_original
if channel_layout:
channels = float(
sum({"LFE": 0.1}.get(position.upper(), 1) for position in channel_layout.split(" "))
)
else:
channel_count = primary_audio_track.channel_s or primary_audio_track.channels or 0
channels = float(channel_count)
features = primary_audio_track.format_additionalfeatures or ""
name += f" {AUDIO_CODEC_MAP.get(codec, codec)}{channels:.1f}"
if "JOC" in features or primary_audio_track.joc:
name += " Atmos"
# Video (dynamic range + hfr +) Codec
if primary_video_track:
codec = primary_video_track.format
hdr_format = primary_video_track.hdr_format_commercial
hdr_format_full = primary_video_track.hdr_format or ""
trc = (
primary_video_track.transfer_characteristics
or primary_video_track.transfer_characteristics_original
or ""
)
frame_rate = float(primary_video_track.frame_rate)
# Primary HDR format detection
if hdr_format:
if hdr_format_full.startswith("Dolby Vision"):
name += " DV"
if any(indicator in hdr_format_full for indicator in ["HDR10", "SMPTE ST 2086"]):
name += " HDR"
else:
name += f" {DYNAMIC_RANGE_MAP.get(hdr_format)} "
elif "HLG" in trc or "Hybrid Log-Gamma" in trc or "ARIB STD-B67" in trc or "arib-std-b67" in trc.lower():
name += " HLG"
elif any(indicator in trc for indicator in ["PQ", "SMPTE ST 2084", "BT.2100"]) or "smpte2084" in trc.lower() or "bt.2020-10" in trc.lower():
name += " HDR"
if frame_rate > 30:
name += " HFR"
name += f" {VIDEO_CODEC_MAP.get(codec, codec)}"
if config.tag:
name += f"-{config.tag}"
return sanitize_filename(name)
else:
# Simple naming style without technical details - use spaces instead of dots
return sanitize_filename(name, " ")
class Movies(SortedKeyList, ABC):
def __init__(self, iterable: Optional[Iterable] = None):
super().__init__(iterable, key=lambda x: x.year or 0)
def __str__(self) -> str:
if not self:
return super().__str__()
# TODO: Assumes there's only one movie
return self[0].name + (f" ({self[0].year})" if self[0].year else "")
def tree(self, verbose: bool = False) -> Tree:
num_movies = len(self)
tree = Tree(f"{num_movies} Movie{['s', ''][num_movies == 1]}", guide_style="bright_black")
if verbose:
for movie in self:
tree.add(f"[bold]{movie.name}[/] [bright_black]({movie.year or '?'})", guide_style="bright_black")
return tree
__all__ = ("Movie", "Movies")

144
unshackle/core/titles/song.py Normal file
View File

@ -0,0 +1,144 @@
from abc import ABC
from typing import Any, Iterable, Optional, Union
from langcodes import Language
from pymediainfo import MediaInfo
from rich.tree import Tree
from sortedcontainers import SortedKeyList
from unshackle.core.config import config
from unshackle.core.constants import AUDIO_CODEC_MAP
from unshackle.core.titles.title import Title
from unshackle.core.utilities import sanitize_filename
class Song(Title):
def __init__(
self,
id_: Any,
service: type,
name: str,
artist: str,
album: str,
track: int,
disc: int,
year: int,
language: Optional[Union[str, Language]] = None,
data: Optional[Any] = None,
) -> None:
super().__init__(id_, service, language, data)
if not name:
raise ValueError("Song name must be provided")
if not isinstance(name, str):
raise TypeError(f"Expected name to be a str, not {name!r}")
if not artist:
raise ValueError("Song artist must be provided")
if not isinstance(artist, str):
raise TypeError(f"Expected artist to be a str, not {artist!r}")
if not album:
raise ValueError("Song album must be provided")
if not isinstance(album, str):
raise TypeError(f"Expected album to be a str, not {name!r}")
if not track:
raise ValueError("Song track must be provided")
if not isinstance(track, int):
raise TypeError(f"Expected track to be an int, not {track!r}")
if not disc:
raise ValueError("Song disc must be provided")
if not isinstance(disc, int):
raise TypeError(f"Expected disc to be an int, not {disc!r}")
if not year:
raise ValueError("Song year must be provided")
if not isinstance(year, int):
raise TypeError(f"Expected year to be an int, not {year!r}")
name = name.strip()
artist = artist.strip()
album = album.strip()
if track <= 0:
raise ValueError(f"Song track cannot be {track}")
if disc <= 0:
raise ValueError(f"Song disc cannot be {disc}")
if year <= 0:
raise ValueError(f"Song year cannot be {year}")
self.name = name
self.artist = artist
self.album = album
self.track = track
self.disc = disc
self.year = year
def __str__(self) -> str:
return "{artist} - {album} ({year}) / {track:02}. {name}".format(
artist=self.artist, album=self.album, year=self.year, track=self.track, name=self.name
).strip()
def get_filename(self, media_info: MediaInfo, folder: bool = False, show_service: bool = True) -> str:
audio_track = next(iter(media_info.audio_tracks), None)
codec = audio_track.format
channel_layout = audio_track.channel_layout or audio_track.channellayout_original
if channel_layout:
channels = float(sum({"LFE": 0.1}.get(position.upper(), 1) for position in channel_layout.split(" ")))
else:
channel_count = audio_track.channel_s or audio_track.channels or 0
channels = float(channel_count)
features = audio_track.format_additionalfeatures or ""
if folder:
# Artist - Album (Year)
name = str(self).split(" / ")[0]
else:
# NN. Song Name
name = str(self).split(" / ")[1]
if config.scene_naming:
# Service
if show_service:
name += f" {self.service.__name__}"
# 'WEB-DL'
name += " WEB-DL"
# Audio Codec + Channels (+ feature)
name += f" {AUDIO_CODEC_MAP.get(codec, codec)}{channels:.1f}"
if "JOC" in features or audio_track.joc:
name += " Atmos"
if config.tag:
name += f"-{config.tag}"
return sanitize_filename(name, " ")
else:
# Simple naming style without technical details
return sanitize_filename(name, " ")
class Album(SortedKeyList, ABC):
def __init__(self, iterable: Optional[Iterable] = None):
super().__init__(iterable, key=lambda x: (x.album, x.disc, x.track, x.year or 0))
def __str__(self) -> str:
if not self:
return super().__str__()
return f"{self[0].artist} - {self[0].album} ({self[0].year or '?'})"
def tree(self, verbose: bool = False) -> Tree:
num_songs = len(self)
tree = Tree(f"{num_songs} Song{['s', ''][num_songs == 1]}", guide_style="bright_black")
if verbose:
for song in self:
tree.add(f"[bold]Track {song.track:02}.[/] [bright_black]({song.name})", guide_style="bright_black")
return tree
__all__ = ("Song", "Album")

68
unshackle/core/titles/title.py Normal file
View File

@ -0,0 +1,68 @@
from __future__ import annotations
from abc import abstractmethod
from typing import Any, Optional, Union
from langcodes import Language
from pymediainfo import MediaInfo
from unshackle.core.tracks import Tracks
class Title:
def __init__(
self, id_: Any, service: type, language: Optional[Union[str, Language]] = None, data: Optional[Any] = None
) -> None:
"""
Media Title from a Service.
Parameters:
id_: An identifier for this specific title. It must be unique. Can be of any
value.
service: Service class that this title is from.
language: The original recorded language for the title. If that information
is not available, this should not be set to anything.
data: Arbitrary storage for the title. Often used to store extra metadata
information, IDs, URIs, and so on.
"""
if not id_: # includes 0, false, and similar values, this is intended
raise ValueError("A unique ID must be provided")
if hasattr(id_, "__len__") and len(id_) < 4:
raise ValueError("The unique ID is not large enough, clash likely.")
if not service:
raise ValueError("Service class must be provided")
if not isinstance(service, type):
raise TypeError(f"Expected service to be a Class (type), not {service!r}")
if language is not None:
if isinstance(language, str):
language = Language.get(language)
elif not isinstance(language, Language):
raise TypeError(f"Expected language to be a {Language} or str, not {language!r}")
self.id = id_
self.service = service
self.language = language
self.data = data
self.tracks = Tracks()
def __eq__(self, other: Title) -> bool:
return self.id == other.id
@abstractmethod
def get_filename(self, media_info: MediaInfo, folder: bool = False, show_service: bool = True) -> str:
"""
Get a Filename for this Title with the provided Media Info.
All filenames should be sanitized with the sanitize_filename() utility function.
Parameters:
media_info: MediaInfo object of the file this name will be used for.
folder: This filename will be used as a folder name. Some changes may want to
be made if this is the case.
show_service: Show the service tag (e.g., iT, NF) in the filename.
"""
__all__ = ("Title",)

11
unshackle/core/tracks/__init__.py Normal file
View File

@ -0,0 +1,11 @@
from .attachment import Attachment
from .audio import Audio
from .chapter import Chapter
from .chapters import Chapters
from .hybrid import Hybrid
from .subtitle import Subtitle
from .track import Track
from .tracks import Tracks
from .video import Video
__all__ = ("Audio", "Attachment", "Chapter", "Chapters", "Hybrid", "Subtitle", "Track", "Tracks", "Video")

147
unshackle/core/tracks/attachment.py Normal file
View File

@ -0,0 +1,147 @@
from __future__ import annotations
import mimetypes
import os
from pathlib import Path
from typing import Optional, Union
from urllib.parse import urlparse
from zlib import crc32
import requests
from unshackle.core.config import config
class Attachment:
def __init__(
self,
path: Union[Path, str, None] = None,
url: Optional[str] = None,
name: Optional[str] = None,
mime_type: Optional[str] = None,
description: Optional[str] = None,
session: Optional[requests.Session] = None,
):
"""
Create a new Attachment.
If providing a path, the file must already exist.
If providing a URL, the file will be downloaded to the temp directory.
Either path or url must be provided.
If name is not provided it will use the file name (without extension).
If mime_type is not provided, it will try to guess it.
Args:
path: Path to an existing file.
url: URL to download the attachment from.
name: Name of the attachment.
mime_type: MIME type of the attachment.
description: Description of the attachment.
session: Optional requests session to use for downloading.
"""
if path is None and url is None:
raise ValueError("Either path or url must be provided.")
if url:
if not isinstance(url, str):
raise ValueError("The attachment URL must be a string.")
# If a URL is provided, download the file to the temp directory
parsed_url = urlparse(url)
file_name = os.path.basename(parsed_url.path) or "attachment"
# Use provided name for the file if available
if name:
file_name = f"{name.replace(' ', '_')}{os.path.splitext(file_name)[1]}"
download_path = config.directories.temp / file_name
# Download the file
try:
session = session or requests.Session()
response = session.get(url, stream=True)
response.raise_for_status()
config.directories.temp.mkdir(parents=True, exist_ok=True)
download_path.parent.mkdir(parents=True, exist_ok=True)
with open(download_path, "wb") as f:
for chunk in response.iter_content(chunk_size=8192):
f.write(chunk)
path = download_path
except Exception as e:
raise ValueError(f"Failed to download attachment from URL: {e}")
if not isinstance(path, (str, Path)):
raise ValueError("The attachment path must be provided.")
path = Path(path)
if not path.exists():
raise ValueError("The attachment file does not exist.")
name = (name or path.stem).strip()
mime_type = (mime_type or "").strip() or None
description = (description or "").strip() or None
if not mime_type:
mime_type = {
".ttf": "application/x-truetype-font",
".otf": "application/vnd.ms-opentype",
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".png": "image/png",
}.get(path.suffix.lower(), mimetypes.guess_type(path)[0])
if not mime_type:
raise ValueError("The attachment mime-type could not be automatically detected.")
self.path = path
self.name = name
self.mime_type = mime_type
self.description = description
def __repr__(self) -> str:
return "{name}({items})".format(
name=self.__class__.__name__, items=", ".join([f"{k}={repr(v)}" for k, v in self.__dict__.items()])
)
def __str__(self) -> str:
return " | ".join(filter(bool, ["ATT", self.name, self.mime_type, self.description]))
@property
def id(self) -> str:
"""Compute an ID from the attachment data."""
checksum = crc32(self.path.read_bytes())
return hex(checksum)
def delete(self) -> None:
if self.path:
self.path.unlink()
self.path = None
@classmethod
def from_url(
cls,
url: str,
name: Optional[str] = None,
mime_type: Optional[str] = None,
description: Optional[str] = None,
session: Optional[requests.Session] = None,
) -> "Attachment":
"""
Create an attachment from a URL.
Args:
url: URL to download the attachment from.
name: Name of the attachment.
mime_type: MIME type of the attachment.
description: Description of the attachment.
session: Optional requests session to use for downloading.
Returns:
Attachment: A new attachment instance.
"""
return cls(url=url, name=name, mime_type=mime_type, description=description, session=session)
__all__ = ("Attachment",)

193
unshackle/core/tracks/audio.py Normal file
View File

@ -0,0 +1,193 @@
from __future__ import annotations
import math
from enum import Enum
from typing import Any, Optional, Union
from unshackle.core.tracks.track import Track
class Audio(Track):
class Codec(str, Enum):
AAC = "AAC" # https://wikipedia.org/wiki/Advanced_Audio_Coding
AC3 = "DD" # https://wikipedia.org/wiki/Dolby_Digital
EC3 = "DD+" # https://wikipedia.org/wiki/Dolby_Digital_Plus
AC4 = "AC-4" # https://wikipedia.org/wiki/Dolby_AC-4
OPUS = "OPUS" # https://wikipedia.org/wiki/Opus_(audio_format)
OGG = "VORB" # https://wikipedia.org/wiki/Vorbis
DTS = "DTS" # https://en.wikipedia.org/wiki/DTS_(company)#DTS_Digital_Surround
ALAC = "ALAC" # https://en.wikipedia.org/wiki/Apple_Lossless_Audio_Codec
FLAC = "FLAC" # https://en.wikipedia.org/wiki/FLAC
@property
def extension(self) -> str:
return self.name.lower()
@staticmethod
def from_mime(mime: str) -> Audio.Codec:
mime = mime.lower().strip().split(".")[0]
if mime == "mp4a":
return Audio.Codec.AAC
if mime == "ac-3":
return Audio.Codec.AC3
if mime == "ec-3":
return Audio.Codec.EC3
if mime == "ac-4":
return Audio.Codec.AC4
if mime == "opus":
return Audio.Codec.OPUS
if mime == "dtsc":
return Audio.Codec.DTS
if mime == "alac":
return Audio.Codec.ALAC
if mime == "flac":
return Audio.Codec.FLAC
raise ValueError(f"The MIME '{mime}' is not a supported Audio Codec")
@staticmethod
def from_codecs(codecs: str) -> Audio.Codec:
for codec in codecs.lower().split(","):
mime = codec.strip().split(".")[0]
try:
return Audio.Codec.from_mime(mime)
except ValueError:
pass
raise ValueError(f"No MIME types matched any supported Audio Codecs in '{codecs}'")
@staticmethod
def from_netflix_profile(profile: str) -> Audio.Codec:
profile = profile.lower().strip()
if profile.startswith("heaac"):
return Audio.Codec.AAC
if profile.startswith("dd-"):
return Audio.Codec.AC3
if profile.startswith("ddplus"):
return Audio.Codec.EC3
if profile.startswith("ac4"):
return Audio.Codec.AC4
if profile.startswith("playready-oggvorbis"):
return Audio.Codec.OGG
raise ValueError(f"The Content Profile '{profile}' is not a supported Audio Codec")
def __init__(
self,
*args: Any,
codec: Optional[Audio.Codec] = None,
bitrate: Optional[Union[str, int, float]] = None,
channels: Optional[Union[str, int, float]] = None,
joc: Optional[int] = None,
descriptive: Union[bool, int] = False,
**kwargs: Any,
):
"""
Create a new Audio track object.
Parameters:
codec: An Audio.Codec enum representing the audio codec.
If not specified, MediaInfo will be used to retrieve the codec
once the track has been downloaded.
bitrate: A number or float representing the average bandwidth in bytes/s.
Float values are rounded up to the nearest integer.
channels: A number, float, or string representing the number of audio channels.
                Strings may represent numbers or floats. Expanded layouts like 7.1.2 are
                not supported. All numbers and strings will be cast to float.
joc: The number of Joint-Object-Coding Channels/Objects in the audio stream.
descriptive: Mark this audio as being descriptive audio for the blind.
Note: If codec, bitrate, channels, or joc is not specified some checks may be
skipped or assume a value. Specifying as much information as possible is highly
recommended.
"""
super().__init__(*args, **kwargs)
if not isinstance(codec, (Audio.Codec, type(None))):
raise TypeError(f"Expected codec to be a {Audio.Codec}, not {codec!r}")
if not isinstance(bitrate, (str, int, float, type(None))):
raise TypeError(f"Expected bitrate to be a {str}, {int}, or {float}, not {bitrate!r}")
if not isinstance(channels, (str, int, float, type(None))):
raise TypeError(f"Expected channels to be a {str}, {int}, or {float}, not {channels!r}")
if not isinstance(joc, (int, type(None))):
raise TypeError(f"Expected joc to be a {int}, not {joc!r}")
if not isinstance(descriptive, (bool, int)) or (isinstance(descriptive, int) and descriptive not in (0, 1)):
raise TypeError(f"Expected descriptive to be a {bool} or bool-like {int}, not {descriptive!r}")
self.codec = codec
try:
self.bitrate = int(math.ceil(float(bitrate))) if bitrate else None
except (ValueError, TypeError) as e:
raise ValueError(f"Expected bitrate to be a number or float, {e}")
try:
self.channels = self.parse_channels(channels) if channels else None
except (ValueError, NotImplementedError) as e:
raise ValueError(f"Expected channels to be a number, float, or a string, {e}")
self.joc = joc
self.descriptive = bool(descriptive)
@property
def atmos(self) -> bool:
"""Return True if the audio track contains Dolby Atmos."""
return bool(self.joc)
def __str__(self) -> str:
return " | ".join(
filter(
bool,
[
"AUD",
f"[{self.codec.value}]" if self.codec else None,
str(self.language),
", ".join(
filter(
bool,
[
str(self.channels) if self.channels else None,
"Atmos" if self.atmos else None,
f"JOC {self.joc}" if self.joc else None,
],
)
),
f"{self.bitrate // 1000} kb/s" if self.bitrate else None,
self.get_track_name(),
self.edition,
],
)
)
@staticmethod
def parse_channels(channels: Union[str, int, float]) -> float:
"""
Converts a Channel string to a float representing audio channel count and layout.
E.g. "3" -> "3.0", "2.1" -> "2.1", ".1" -> "0.1".
This does not validate channel strings as genuine channel counts or valid layouts.
It does not convert the value to assume a sub speaker channel layout, e.g. 5.1->6.0.
It also does not support expanded surround sound channel layout strings like 7.1.2.
"""
if isinstance(channels, str):
# TODO: Support all possible DASH channel configurations (https://datatracker.ietf.org/doc/html/rfc8216)
if channels.upper() == "A000":
return 2.0
elif channels.upper() == "F801":
return 5.1
elif channels.replace("ch", "").replace(".", "", 1).isdigit():
# e.g., '2ch', '2', '2.0', '5.1ch', '5.1'
return float(channels.replace("ch", ""))
raise NotImplementedError(f"Unsupported Channels string value, '{channels}'")
return float(channels)
def get_track_name(self) -> Optional[str]:
"""Return the base Track Name."""
track_name = super().get_track_name() or ""
flag = self.descriptive and "Descriptive"
if flag:
if track_name:
flag = f" ({flag})"
track_name += flag
return track_name or None
__all__ = ("Audio",)

77
unshackle/core/tracks/chapter.py Normal file
View File

@ -0,0 +1,77 @@
from __future__ import annotations
import re
from typing import Optional, Union
from zlib import crc32
TIMESTAMP_FORMAT = re.compile(r"^(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})(?P<ms>\.\d{3}|)$")
class Chapter:
def __init__(self, timestamp: Union[str, int, float], name: Optional[str] = None):
"""
Create a new Chapter with a Timestamp and optional name.
The timestamp may be in the following formats:
- "HH:MM:SS" string, e.g., `25:05:23`.
- "HH:MM:SS.mss" string, e.g., `25:05:23.120`.
- a timecode integer in milliseconds, e.g., `90323120` is `25:05:23.120`.
- a timecode float in seconds, e.g., `90323.12` is `25:05:23.120`.
If you have a timecode integer in seconds, just multiply it by 1000.
If you have a timecode float in milliseconds (no decimal value), just convert
it to an integer.
"""
if timestamp is None:
raise ValueError("The timestamp must be provided.")
if not isinstance(timestamp, (str, int, float)):
raise TypeError(f"Expected timestamp to be {str}, {int} or {float}, not {type(timestamp)}")
if not isinstance(name, (str, type(None))):
raise TypeError(f"Expected name to be {str}, not {type(name)}")
if not isinstance(timestamp, str):
if isinstance(timestamp, int): # ms
hours, remainder = divmod(timestamp, 1000 * 60 * 60)
minutes, remainder = divmod(remainder, 1000 * 60)
seconds, ms = divmod(remainder, 1000)
elif isinstance(timestamp, float): # seconds.ms
hours, remainder = divmod(timestamp, 60 * 60)
minutes, remainder = divmod(remainder, 60)
seconds, ms = divmod(int(remainder * 1000), 1000)
else:
raise TypeError
timestamp = f"{int(hours):02}:{int(minutes):02}:{int(seconds):02}.{str(ms).zfill(3)[:3]}"
timestamp_m = TIMESTAMP_FORMAT.match(timestamp)
if not timestamp_m:
raise ValueError(f"The timestamp format is invalid: {timestamp}")
hour, minute, second, ms = timestamp_m.groups()
if not ms:
timestamp += ".000"
self.timestamp = timestamp
self.name = name
def __repr__(self) -> str:
return "{name}({items})".format(
name=self.__class__.__name__, items=", ".join([f"{k}={repr(v)}" for k, v in self.__dict__.items()])
)
def __str__(self) -> str:
return " | ".join(filter(bool, ["CHP", self.timestamp, self.name]))
@property
def id(self) -> str:
"""Compute an ID from the Chapter data."""
checksum = crc32(str(self).encode("utf8"))
return hex(checksum)
@property
def named(self) -> bool:
"""Check if Chapter is named."""
return bool(self.name)
__all__ = ("Chapter",)

144
unshackle/core/tracks/chapters.py Normal file
View File

@ -0,0 +1,144 @@
from __future__ import annotations
import re
from abc import ABC
from pathlib import Path
from typing import Any, Iterable, Optional, Union
from zlib import crc32
from sortedcontainers import SortedKeyList
from unshackle.core.tracks import Chapter
OGM_SIMPLE_LINE_1_FORMAT = re.compile(r"^CHAPTER(?P<number>\d+)=(?P<timestamp>\d{2,}:\d{2}:\d{2}\.\d{3})$")
OGM_SIMPLE_LINE_2_FORMAT = re.compile(r"^CHAPTER(?P<number>\d+)NAME=(?P<name>.*)$")
class Chapters(SortedKeyList, ABC):
def __init__(self, iterable: Optional[Iterable[Chapter]] = None):
super().__init__(key=lambda x: x.timestamp or 0)
for chapter in iterable or []:
self.add(chapter)
def __repr__(self) -> str:
return "{name}({items})".format(
name=self.__class__.__name__, items=", ".join([f"{k}={repr(v)}" for k, v in self.__dict__.items()])
)
def __str__(self) -> str:
return "\n".join(
[
" | ".join(filter(bool, ["CHP", f"[{i:02}]", chapter.timestamp, chapter.name]))
for i, chapter in enumerate(self, start=1)
]
)
@classmethod
def loads(cls, data: str) -> Chapters:
"""Load chapter data from a string."""
lines = [line.strip() for line in data.strip().splitlines(keepends=False)]
if len(lines) % 2 != 0:
raise ValueError("The number of chapter lines must be even.")
chapters = []
for line_1, line_2 in zip(lines[::2], lines[1::2]):
line_1_match = OGM_SIMPLE_LINE_1_FORMAT.match(line_1)
if not line_1_match:
raise SyntaxError(f"An unexpected syntax error occurred on: {line_1}")
line_2_match = OGM_SIMPLE_LINE_2_FORMAT.match(line_2)
if not line_2_match:
raise SyntaxError(f"An unexpected syntax error occurred on: {line_2}")
line_1_number, timestamp = line_1_match.groups()
line_2_number, name = line_2_match.groups()
if line_1_number != line_2_number:
raise SyntaxError(
f"The chapter numbers {line_1_number} and {line_2_number} do not match on:\n{line_1}\n{line_2}"
)
if not timestamp:
raise SyntaxError(f"The timestamp is missing on: {line_1}")
chapters.append(Chapter(timestamp, name))
return cls(chapters)
@classmethod
def load(cls, path: Union[Path, str]) -> Chapters:
"""Load chapter data from a file."""
if isinstance(path, str):
path = Path(path)
return cls.loads(path.read_text(encoding="utf8"))
def dumps(self, fallback_name: str = "") -> str:
"""
Return chapter data in OGM-based Simple Chapter format.
https://mkvtoolnix.download/doc/mkvmerge.html#mkvmerge.chapters.simple
Parameters:
fallback_name: Name used for Chapters without a Name set.
The fallback name can use the following variables in f-string style:
- {i}: The Chapter number starting at 1.
E.g., `"Chapter {i}"`: "Chapter 1", "Intro", "Chapter 3".
- {j}: A number starting at 1 that increments any time a Chapter has no name.
E.g., `"Chapter {j}"`: "Chapter 1", "Intro", "Chapter 2".
These are formatted with f-strings, directives are supported.
For example, `"Chapter {i:02}"` will result in `"Chapter 01"`.
"""
chapters = []
j = 0
for i, chapter in enumerate(self, start=1):
if not chapter.name:
j += 1
chapters.append(
"CHAPTER{num}={time}\nCHAPTER{num}NAME={name}".format(
num=f"{i:02}", time=chapter.timestamp, name=chapter.name or fallback_name.format(i=i, j=j)
)
)
return "\n".join(chapters)
def dump(self, path: Union[Path, str], *args: Any, **kwargs: Any) -> int:
"""
Write chapter data in OGM-based Simple Chapter format to a file.
Parameters:
path: The file path to write the Chapter data to, overwriting
any existing data.
See `Chapters.dumps` for more parameter documentation.
"""
if isinstance(path, str):
path = Path(path)
path.parent.mkdir(parents=True, exist_ok=True)
ogm_text = self.dumps(*args, **kwargs)
return path.write_text(ogm_text, encoding="utf8")
def add(self, value: Chapter) -> None:
if not isinstance(value, Chapter):
raise TypeError(f"Can only add {Chapter} objects, not {type(value)}")
if any(chapter.timestamp == value.timestamp for chapter in self):
raise ValueError(f"A Chapter with the Timestamp {value.timestamp} already exists")
super().add(value)
if not any(chapter.timestamp == "00:00:00.000" for chapter in self):
self.add(Chapter(0))
@property
def id(self) -> str:
"""Compute an ID from the Chapter data."""
checksum = crc32("\n".join([chapter.id for chapter in self]).encode("utf8"))
return hex(checksum)
__all__ = ("Chapters", "Chapter")

327
unshackle/core/tracks/hybrid.py Normal file
View File

@ -0,0 +1,327 @@
import json
import logging
import os
import subprocess
import sys
from pathlib import Path
from rich.padding import Padding
from rich.rule import Rule
from unshackle.core.binaries import DoviTool, HDR10PlusTool
from unshackle.core.config import config
from unshackle.core.console import console
class Hybrid:
    def __init__(self, videos, source) -> None:
        """
        Takes the Dolby Vision and HDR10(+) streams out of the VideoTracks.
        It will then attempt to inject the Dolby Vision metadata layer into the HDR10(+) stream.
        If no DV track is available but HDR10+ is present, it will convert HDR10+ to DV.
        """
        self.log = logging.getLogger("hybrid")
        from unshackle.core.tracks import Video
self.videos = videos
self.source = source
self.rpu_file = "RPU.bin"
self.hdr_type = "HDR10"
self.hevc_file = f"{self.hdr_type}-DV.hevc"
self.hdr10plus_to_dv = False
self.hdr10plus_file = "HDR10Plus.json"
# Get resolution info from HDR10 track for display
hdr10_track = next((v for v in videos if v.range == Video.Range.HDR10), None)
hdr10p_track = next((v for v in videos if v.range == Video.Range.HDR10P), None)
track_for_res = hdr10_track or hdr10p_track
self.resolution = f"{track_for_res.height}p" if track_for_res and track_for_res.height else "Unknown"
console.print(Padding(Rule(f"[rule.text]HDR10+DV Hybrid ({self.resolution})"), (1, 2)))
for video in self.videos:
if not video.path or not os.path.exists(video.path):
raise ValueError(f"Video track {video.id} was not downloaded before injection.")
# Check if we have DV track available
has_dv = any(video.range == Video.Range.DV for video in self.videos)
has_hdr10 = any(video.range == Video.Range.HDR10 for video in self.videos)
has_hdr10p = any(video.range == Video.Range.HDR10P for video in self.videos)
if not has_hdr10:
raise ValueError("No HDR10 track available for hybrid processing.")
# If we have HDR10+ but no DV, we can convert HDR10+ to DV
if not has_dv and has_hdr10p:
self.log.info("✓ No DV track found, but HDR10+ is available. Will convert HDR10+ to DV.")
self.hdr10plus_to_dv = True
elif not has_dv:
raise ValueError("No DV track available and no HDR10+ to convert.")
if os.path.isfile(config.directories.temp / self.hevc_file):
self.log.info("✓ Already Injected")
return
for video in videos:
# Use the actual path from the video track
save_path = video.path
if not save_path or not os.path.exists(save_path):
raise ValueError(f"Video track {video.id} was not downloaded or path not found: {save_path}")
if video.range == Video.Range.HDR10:
self.extract_stream(save_path, "HDR10")
elif video.range == Video.Range.HDR10P:
self.extract_stream(save_path, "HDR10")
self.hdr_type = "HDR10+"
elif video.range == Video.Range.DV:
self.extract_stream(save_path, "DV")
if self.hdr10plus_to_dv:
# Extract HDR10+ metadata and convert to DV
hdr10p_video = next(v for v in videos if v.range == Video.Range.HDR10P)
self.extract_hdr10plus(hdr10p_video)
self.convert_hdr10plus_to_dv()
else:
# Regular DV extraction
dv_video = next(v for v in videos if v.range == Video.Range.DV)
self.extract_rpu(dv_video)
if os.path.isfile(config.directories.temp / "RPU_UNT.bin"):
self.rpu_file = "RPU_UNT.bin"
self.level_6()
# Mode 3 conversion already done during extraction when not untouched
elif os.path.isfile(config.directories.temp / "RPU.bin"):
# RPU already extracted with mode 3
pass
self.injecting()
self.log.info("✓ Injection Completed")
if self.source in ("itunes", "appletvplus"):
(config.directories.temp / "hdr10.mkv").unlink(missing_ok=True)
(config.directories.temp / "dv.mkv").unlink(missing_ok=True)
(config.directories.temp / "HDR10.hevc").unlink(missing_ok=True)
(config.directories.temp / "DV.hevc").unlink(missing_ok=True)
(config.directories.temp / self.rpu_file).unlink(missing_ok=True)
def ffmpeg_simple(self, save_path, output):
"""Simple ffmpeg execution without progress tracking"""
p = subprocess.run(
[
"ffmpeg",
"-nostdin",
"-i",
str(save_path),
"-c:v",
"copy",
"-y", # overwrite output; must come before the output file
str(output),
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
return p.returncode
def extract_stream(self, save_path, type_):
output = Path(config.directories.temp / f"{type_}.hevc")
with console.status(f"Extracting {type_} stream...", spinner="dots"):
returncode = self.ffmpeg_simple(save_path, output)
if returncode:
output.unlink(missing_ok=True)
self.log.error(f"x Failed extracting {type_} stream")
sys.exit(1)
self.log.info(f"Extracted {type_} stream")
def extract_rpu(self, video, untouched=False):
if os.path.isfile(config.directories.temp / "RPU.bin") or os.path.isfile(
config.directories.temp / "RPU_UNT.bin"
):
return
with console.status(
f"Extracting{' untouched ' if untouched else ' '}RPU from Dolby Vision stream...", spinner="dots"
):
extraction_args = [str(DoviTool)]
if not untouched:
extraction_args += ["-m", "3"]
extraction_args += [
"extract-rpu",
config.directories.temp / "DV.hevc",
"-o",
config.directories.temp / f"{'RPU' if not untouched else 'RPU_UNT'}.bin",
]
rpu_extraction = subprocess.run(
extraction_args,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
if rpu_extraction.returncode:
Path.unlink(config.directories.temp / f"{'RPU' if not untouched else 'RPU_UNT'}.bin", missing_ok=True)
if b"MAX_PQ_LUMINANCE" in rpu_extraction.stderr:
self.extract_rpu(video, untouched=True)
elif b"Invalid PPS index" in rpu_extraction.stderr:
raise ValueError("Dolby Vision VideoTrack seems to be corrupt")
else:
raise ValueError(f"Failed extracting{' untouched ' if untouched else ' '}RPU from Dolby Vision stream")
self.log.info(f"Extracted{' untouched ' if untouched else ' '}RPU from Dolby Vision stream")
def level_6(self):
"""Edit RPU Level 6 values"""
with open(config.directories.temp / "L6.json", "w+") as level6_file:
level6 = {
"cm_version": "V29",
"length": 0,
"level6": {
"max_display_mastering_luminance": 1000,
"min_display_mastering_luminance": 1,
"max_content_light_level": 0,
"max_frame_average_light_level": 0,
},
}
json.dump(level6, level6_file, indent=3)
if not os.path.isfile(config.directories.temp / "RPU_L6.bin"):
with console.status("Editing RPU Level 6 values...", spinner="dots"):
level6 = subprocess.run(
[
str(DoviTool),
"editor",
"-i",
config.directories.temp / self.rpu_file,
"-j",
config.directories.temp / "L6.json",
"-o",
config.directories.temp / "RPU_L6.bin",
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
if level6.returncode:
Path.unlink(config.directories.temp / "RPU_L6.bin")
raise ValueError("Failed editing RPU Level 6 values")
self.log.info("Edited RPU Level 6 values")
# Update rpu_file to use the edited version
self.rpu_file = "RPU_L6.bin"
def injecting(self):
if os.path.isfile(config.directories.temp / self.hevc_file):
return
with console.status(f"Injecting Dolby Vision metadata into {self.hdr_type} stream...", spinner="dots"):
inject_cmd = [
str(DoviTool),
"inject-rpu",
"-i",
config.directories.temp / "HDR10.hevc",
"--rpu-in",
config.directories.temp / self.rpu_file,
]
# When converting from HDR10+, drop the HDR10+ metadata during injection,
# since the resulting stream carries Dolby Vision instead
if self.hdr10plus_to_dv:
inject_cmd.append("--drop-hdr10plus")
self.log.info(" - Removing HDR10+ metadata during injection")
inject_cmd.extend(["-o", config.directories.temp / self.hevc_file])
inject = subprocess.run(
inject_cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
if inject.returncode:
Path.unlink(config.directories.temp / self.hevc_file)
raise ValueError("Failed injecting Dolby Vision metadata into HDR10 stream")
self.log.info(f"Injected Dolby Vision metadata into {self.hdr_type} stream")
def extract_hdr10plus(self, _video):
"""Extract HDR10+ metadata from the video stream"""
if os.path.isfile(config.directories.temp / self.hdr10plus_file):
return
if not HDR10PlusTool:
raise ValueError("HDR10Plus_tool not found. Please install it to use HDR10+ to DV conversion.")
with console.status("Extracting HDR10+ metadata...", spinner="dots"):
# HDR10Plus_tool needs raw HEVC stream
extraction = subprocess.run(
[
str(HDR10PlusTool),
"extract",
str(config.directories.temp / "HDR10.hevc"),
"-o",
str(config.directories.temp / self.hdr10plus_file),
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
if extraction.returncode:
raise ValueError("Failed extracting HDR10+ metadata")
# Check if the extracted file has content
if os.path.getsize(config.directories.temp / self.hdr10plus_file) == 0:
raise ValueError("No HDR10+ metadata found in the stream")
self.log.info("Extracted HDR10+ metadata")
def convert_hdr10plus_to_dv(self):
"""Convert HDR10+ metadata to Dolby Vision RPU"""
if os.path.isfile(config.directories.temp / "RPU.bin"):
return
with console.status("Converting HDR10+ metadata to Dolby Vision...", spinner="dots"):
# First create the extra metadata JSON for dovi_tool
extra_metadata = {
"cm_version": "V29",
"length": 0, # dovi_tool will figure this out
"level6": {
"max_display_mastering_luminance": 1000,
"min_display_mastering_luminance": 1,
"max_content_light_level": 0,
"max_frame_average_light_level": 0,
},
}
with open(config.directories.temp / "extra.json", "w") as f:
json.dump(extra_metadata, f, indent=2)
# Generate DV RPU from HDR10+ metadata
conversion = subprocess.run(
[
str(DoviTool),
"generate",
"-j",
str(config.directories.temp / "extra.json"),
"--hdr10plus-json",
str(config.directories.temp / self.hdr10plus_file),
"-o",
str(config.directories.temp / "RPU.bin"),
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
if conversion.returncode:
raise ValueError("Failed converting HDR10+ to Dolby Vision")
self.log.info("Converted HDR10+ metadata to Dolby Vision")
self.log.info("✓ HDR10+ successfully converted to Dolby Vision Profile 8")
# Clean up temporary files
Path.unlink(config.directories.temp / "extra.json")
Path.unlink(config.directories.temp / self.hdr10plus_file)
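A hedged usage sketch for the Hybrid class above; hdr10_video and dv_video stand in for already-downloaded Video tracks, and the output filename follows the f"{hdr_type}-DV.hevc" pattern from __init__.

from unshackle.core.config import config

Hybrid([hdr10_video, dv_video], source="itunes")  # placeholders for downloaded tracks
hybrid_stream = config.directories.temp / "HDR10-DV.hevc"  # the injected stream on success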

File diff suppressed because it is too large

View File

@ -0,0 +1,727 @@
import base64
import html
import logging
import re
import shutil
import subprocess
from collections import defaultdict
from copy import copy
from enum import Enum
from functools import partial
from pathlib import Path
from typing import Any, Callable, Iterable, Optional, Union
from uuid import UUID
from zlib import crc32
from curl_cffi.requests import Session as CurlSession
from langcodes import Language
from pyplayready.cdm import Cdm as PlayReadyCdm
from pywidevine.cdm import Cdm as WidevineCdm
from requests import Session
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.constants import DOWNLOAD_CANCELLED, DOWNLOAD_LICENCE_ONLY
from unshackle.core.downloaders import aria2c, curl_impersonate, n_m3u8dl_re, requests
from unshackle.core.drm import DRM_T, PlayReady, Widevine
from unshackle.core.events import events
from unshackle.core.utilities import get_boxes, get_extension, try_ensure_utf8
from unshackle.core.utils.subprocess import ffprobe
class Track:
class Descriptor(Enum):
URL = 1 # Direct URL, nothing fancy
HLS = 2 # https://en.wikipedia.org/wiki/HTTP_Live_Streaming
DASH = 3 # https://en.wikipedia.org/wiki/Dynamic_Adaptive_Streaming_over_HTTP
ISM = 4 # https://learn.microsoft.com/en-us/silverlight/smooth-streaming
def __init__(
self,
url: Union[str, list[str]],
language: Union[Language, str],
is_original_lang: bool = False,
descriptor: Descriptor = Descriptor.URL,
needs_repack: bool = False,
name: Optional[str] = None,
drm: Optional[Iterable[DRM_T]] = None,
edition: Optional[str] = None,
downloader: Optional[Callable] = None,
downloader_args: Optional[dict] = None,
from_file: Optional[Path] = None,
data: Optional[Union[dict, defaultdict]] = None,
id_: Optional[str] = None,
extra: Optional[Any] = None,
) -> None:
if not isinstance(url, (str, list)):
raise TypeError(f"Expected url to be a {str}, or list of {str}, not {type(url)}")
if not isinstance(language, (Language, str)):
raise TypeError(f"Expected language to be a {Language} or {str}, not {type(language)}")
if not isinstance(is_original_lang, bool):
raise TypeError(f"Expected is_original_lang to be a {bool}, not {type(is_original_lang)}")
if not isinstance(descriptor, Track.Descriptor):
raise TypeError(f"Expected descriptor to be a {Track.Descriptor}, not {type(descriptor)}")
if not isinstance(needs_repack, bool):
raise TypeError(f"Expected needs_repack to be a {bool}, not {type(needs_repack)}")
if not isinstance(name, (str, type(None))):
raise TypeError(f"Expected name to be a {str}, not {type(name)}")
if not isinstance(id_, (str, type(None))):
raise TypeError(f"Expected id_ to be a {str}, not {type(id_)}")
if not isinstance(edition, (str, type(None))):
raise TypeError(f"Expected edition to be a {str}, not {type(edition)}")
if not isinstance(downloader, (Callable, type(None))):
raise TypeError(f"Expected downloader to be a {Callable}, not {type(downloader)}")
if not isinstance(downloader_args, (dict, type(None))):
raise TypeError(f"Expected downloader_args to be a {dict}, not {type(downloader_args)}")
if not isinstance(from_file, (Path, type(None))):
raise TypeError(f"Expected from_file to be a {Path}, not {type(from_file)}")
if not isinstance(data, (dict, defaultdict, type(None))):
raise TypeError(f"Expected data to be a {dict} or {defaultdict}, not {type(data)}")
invalid_urls = ", ".join(sorted({str(type(x)) for x in url if not isinstance(x, str)}))
if invalid_urls:
raise TypeError(f"Expected all items in url to be a {str}, but found {invalid_urls}")
if drm is not None:
try:
iter(drm)
except TypeError:
raise TypeError(f"Expected drm to be an iterable, not {type(drm)}")
if downloader is None:
downloader = {
"aria2c": aria2c,
"curl_impersonate": curl_impersonate,
"requests": requests,
"n_m3u8dl_re": n_m3u8dl_re,
}[config.downloader]
self.path: Optional[Path] = None
self.url = url
self.language = Language.get(language)
self.is_original_lang = is_original_lang
self.descriptor = descriptor
self.needs_repack = needs_repack
self.name = name
self.drm = drm
self.edition: str = edition
self.downloader = downloader
self.downloader_args = downloader_args
self.from_file = from_file
self._data: defaultdict[Any, Any] = defaultdict(dict)
self.data = data or {}
self.extra: Any = extra or {} # allow anything for extra, but default to a dict
if self.name is None:
lang = Language.get(self.language)
if (lang.language or "").lower() == (lang.territory or "").lower():
lang.territory = None # e.g. en-en, de-DE
reduced = lang.simplify_script()
extra_parts = []
if reduced.script is not None:
script = reduced.script_name(max_distance=25)
if script and script != "Zzzz":
extra_parts.append(script)
if reduced.territory is not None:
territory = reduced.territory_name(max_distance=25)
if territory and territory != "ZZ":
territory = territory.removesuffix(" SAR China")
extra_parts.append(territory)
self.name = ", ".join(extra_parts) or None
if not id_:
this = copy(self)
# url may be a list of segment/mirror URLs; use the first for a deterministic value
this.url = (self.url if isinstance(self.url, str) else self.url[0]).rsplit("?", maxsplit=1)[0]
checksum = crc32(repr(this).encode("utf8"))
id_ = hex(checksum)[2:]
self.id = id_
# TODO: Currently using OnFoo event naming, change to just segment_filter
self.OnSegmentFilter: Optional[Callable] = None
def __repr__(self) -> str:
return "{name}({items})".format(
name=self.__class__.__name__, items=", ".join([f"{k}={repr(v)}" for k, v in self.__dict__.items()])
)
def __eq__(self, other: Any) -> bool:
return isinstance(other, Track) and self.id == other.id
@property
def data(self) -> defaultdict[Any, Any]:
"""
Arbitrary track data dictionary.
A defaultdict is used with a dict as the factory for easier
nested saving and safer exists-checks.
Reserved keys:
- "hls" used by the HLS class.
- playlist: m3u8.model.Playlist - The primary track information.
- media: m3u8.model.Media - The audio/subtitle track information.
- segment_durations: list[int] - A list of each segment's duration.
- "dash" used by the DASH class.
- manifest: lxml.ElementTree - DASH MPD manifest.
- period: lxml.Element - The period of this track.
- adaptation_set: lxml.Element - The adaptation set of this track.
- representation: lxml.Element - The representation of this track.
- timescale: int - The timescale of the track's segments.
- segment_durations: list[int] - A list of each segment's duration.
You should not add, change, or remove any data within reserved keys.
You may read their data, but note that the values may change or be
removed at any point.
"""
return self._data
@data.setter
def data(self, value: Union[dict, defaultdict]) -> None:
if not isinstance(value, (dict, defaultdict)):
raise TypeError(f"Expected data to be a {dict} or {defaultdict}, not {type(value)}")
if isinstance(value, dict):
value = defaultdict(dict, **value)
self._data = value
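# For example, the reserved keys documented above can be read (but must not be
# mutated) like so for a DASH track; the variable names are illustrative:
#   timescale = track.data["dash"]["timescale"]
#   durations = track.data["dash"]["segment_durations"]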
def download(
self,
session: Session,
prepare_drm: partial,
max_workers: Optional[int] = None,
progress: Optional[partial] = None,
*,
cdm: Optional[object] = None,
):
"""Download and optionally Decrypt this Track."""
from unshackle.core.manifests import DASH, HLS, ISM
if DOWNLOAD_LICENCE_ONLY.is_set():
progress(downloaded="[yellow]SKIPPING")
if DOWNLOAD_CANCELLED.is_set():
progress(downloaded="[yellow]SKIPPED")
return
log = logging.getLogger("track")
proxy = next(iter(session.proxies.values()), None)
track_type = self.__class__.__name__
save_path = config.directories.temp / f"{track_type}_{self.id}.mp4"
if track_type == "Subtitle":
save_path = save_path.with_suffix(f".{self.codec.extension}")
# n_m3u8dl_re doesn't support directly downloading subtitles from URLs
# or when the subtitle has a direct file extension
if self.downloader.__name__ == "n_m3u8dl_re" and (
self.descriptor == self.Descriptor.URL
or get_extension(self.url) in {
".srt",
".vtt",
".ttml",
".ssa",
".ass",
".stpp",
".wvtt",
".xml",
}
):
self.downloader = requests
if self.descriptor != self.Descriptor.URL:
save_dir = save_path.with_name(save_path.name + "_segments")
else:
save_dir = save_path.parent
def cleanup():
# track file (e.g., "foo.mp4")
save_path.unlink(missing_ok=True)
# aria2c control file (e.g., "foo.mp4.aria2" or "foo.mp4.aria2__temp")
save_path.with_suffix(f"{save_path.suffix}.aria2").unlink(missing_ok=True)
save_path.with_suffix(f"{save_path.suffix}.aria2__temp").unlink(missing_ok=True)
if save_dir.exists() and save_dir.name.endswith("_segments"):
shutil.rmtree(save_dir)
if not DOWNLOAD_LICENCE_ONLY.is_set():
if config.directories.temp.is_file():
raise ValueError(f"Temp Directory '{config.directories.temp}' must be a Directory, not a file")
config.directories.temp.mkdir(parents=True, exist_ok=True)
# Delete any pre-existing temp files matching this track.
# We can't re-use or continue downloading these tracks as they do not use a
# lock file. Or at least the majority don't. Even if they did I've encountered
# corruptions caused by sudden interruptions to the lock file.
cleanup()
try:
if self.descriptor == self.Descriptor.HLS:
HLS.download_track(
track=self,
save_path=save_path,
save_dir=save_dir,
progress=progress,
session=session,
proxy=proxy,
max_workers=max_workers,
license_widevine=prepare_drm,
cdm=cdm,
)
elif self.descriptor == self.Descriptor.DASH:
DASH.download_track(
track=self,
save_path=save_path,
save_dir=save_dir,
progress=progress,
session=session,
proxy=proxy,
max_workers=max_workers,
license_widevine=prepare_drm,
cdm=cdm,
)
elif self.descriptor == self.Descriptor.ISM:
ISM.download_track(
track=self,
save_path=save_path,
save_dir=save_dir,
progress=progress,
session=session,
proxy=proxy,
max_workers=max_workers,
license_widevine=prepare_drm,
cdm=cdm,
)
elif self.descriptor == self.Descriptor.URL:
try:
if not self.drm and track_type in ("Video", "Audio"):
# the service might not have explicitly defined the `drm` property
# try find widevine DRM information from the init data of URL
try:
self.drm = [Widevine.from_track(self, session)]
except Widevine.Exceptions.PSSHNotFound:
# it might not have Widevine DRM, or might not have found the PSSH
log.warning("No Widevine PSSH was found for this track, is it DRM free?")
if self.drm:
track_kid = self.get_key_id(session=session)
drm = self.get_drm_for_cdm(cdm)
if isinstance(drm, Widevine):
# license and grab content keys
if not prepare_drm:
raise ValueError("prepare_drm func must be supplied to use Widevine DRM")
progress(downloaded="LICENSING")
prepare_drm(drm, track_kid=track_kid)
progress(downloaded="[yellow]LICENSED")
elif isinstance(drm, PlayReady):
# license and grab content keys
if not prepare_drm:
raise ValueError("prepare_drm func must be supplied to use PlayReady DRM")
progress(downloaded="LICENSING")
prepare_drm(drm, track_kid=track_kid)
progress(downloaded="[yellow]LICENSED")
else:
drm = None
if DOWNLOAD_LICENCE_ONLY.is_set():
progress(downloaded="[yellow]SKIPPED")
elif track_type != "Subtitle" and self.downloader.__name__ == "n_m3u8dl_re":
progress(downloaded="[red]FAILED")
error = f"[N_m3u8DL-RE]: {self.descriptor} is currently not supported"
raise ValueError(error)
else:
for status_update in self.downloader(
urls=self.url,
output_dir=save_path.parent,
filename=save_path.name,
headers=session.headers,
cookies=session.cookies,
proxy=proxy,
max_workers=max_workers,
):
file_downloaded = status_update.get("file_downloaded")
if not file_downloaded:
progress(**status_update)
# see https://github.com/devine-dl/devine/issues/71
save_path.with_suffix(f"{save_path.suffix}.aria2__temp").unlink(missing_ok=True)
self.path = save_path
events.emit(events.Types.TRACK_DOWNLOADED, track=self)
if drm:
progress(downloaded="Decrypting", completed=0, total=100)
drm.decrypt(save_path)
self.drm = None
events.emit(events.Types.TRACK_DECRYPTED, track=self, drm=drm, segment=None)
progress(downloaded="Decrypted", completed=100)
if track_type == "Subtitle" and self.codec.name not in ("fVTT", "fTTML"):
track_data = self.path.read_bytes()
track_data = try_ensure_utf8(track_data)
track_data = (
track_data.decode("utf8")
.replace("&lrm;", html.unescape("&lrm;"))
.replace("&rlm;", html.unescape("&rlm;"))
.encode("utf8")
)
self.path.write_bytes(track_data)
progress(downloaded="Downloaded")
except KeyboardInterrupt:
DOWNLOAD_CANCELLED.set()
progress(downloaded="[yellow]CANCELLED")
raise
except Exception:
DOWNLOAD_CANCELLED.set()
progress(downloaded="[red]FAILED")
raise
except (Exception, KeyboardInterrupt):
if not DOWNLOAD_LICENCE_ONLY.is_set():
cleanup()
raise
if DOWNLOAD_CANCELLED.is_set():
# we stopped during the download, let's exit
return
if not DOWNLOAD_LICENCE_ONLY.is_set():
if self.path.stat().st_size <= 3: # Empty UTF-8 BOM == 3 bytes
raise IOError("Download failed, the downloaded file is empty.")
events.emit(events.Types.TRACK_DOWNLOADED, track=self)
def delete(self) -> None:
if self.path:
self.path.unlink()
self.path = None
def move(self, target: Union[Path, str]) -> Path:
"""
Move the Track's file from current location, to target location.
This will overwrite anything at the target path.
Raises:
TypeError: If the target argument is not the expected type.
ValueError: If track has no file to move, or the target does not exist.
OSError: If the file somehow failed to move.
Returns the new location of the track.
"""
if not isinstance(target, (str, Path)):
raise TypeError(f"Expected {target} to be a {Path} or {str}, not {type(target)}")
if not self.path:
raise ValueError("Track has no file to move")
if not isinstance(target, Path):
target = Path(target)
if not target.exists():
raise ValueError(f"Target file {repr(target)} does not exist")
moved_to = Path(shutil.move(self.path, target))
if moved_to.resolve() != target.resolve():
raise OSError(f"Failed to move {self.path} to {target}")
self.path = target
return target
def get_track_name(self) -> Optional[str]:
"""Get the Track Name."""
return self.name
def get_drm_for_cdm(self, cdm: Optional[object]) -> Optional[DRM_T]:
"""Return the DRM matching the provided CDM, if available."""
if not self.drm:
return None
if isinstance(cdm, WidevineCdm):
for drm in self.drm:
if isinstance(drm, Widevine):
return drm
elif isinstance(cdm, PlayReadyCdm):
for drm in self.drm:
if isinstance(drm, PlayReady):
return drm
elif hasattr(cdm, "is_playready"):
if cdm.is_playready:
for drm in self.drm:
if isinstance(drm, PlayReady):
return drm
else:
for drm in self.drm:
if isinstance(drm, Widevine):
return drm
return self.drm[0]
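# Illustrative behaviour with hypothetical DRM/CDM objects, per the branches above:
#   track.drm = [widevine_drm, playready_drm]
#   track.get_drm_for_cdm(widevine_cdm)   -> widevine_drm
#   track.get_drm_for_cdm(playready_cdm)  -> playready_drm
#   track.get_drm_for_cdm(None)           -> widevine_drm (no branch matches, falls back to drm[0])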
def get_key_id(self, init_data: Optional[bytes] = None, *args, **kwargs) -> Optional[UUID]:
"""
Probe the DRM encryption Key ID (KID) for this specific track.
It currently supports finding the Key ID by probing the track's stream
with ffprobe for `enc_key_id` data, as well as for mp4 `tenc` (Track
Encryption) boxes.
It explicitly ignores PSSH information like the `PSSH` box, as the box
is likely to contain multiple Key IDs that may or may not be for this
specific track.
To retrieve the initialization segment, this method calls :meth:`get_init_segment`
with the positional and keyword arguments. The return value of `get_init_segment`
is then used to determine the Key ID.
Returns:
The Key ID as a UUID object, or None if the Key ID could not be determined.
"""
if not init_data:
init_data = self.get_init_segment(*args, **kwargs)
if not isinstance(init_data, bytes):
raise TypeError(f"Expected init_data to be bytes, not {init_data!r}")
probe = ffprobe(init_data)
if probe:
for stream in probe.get("streams") or []:
enc_key_id = stream.get("tags", {}).get("enc_key_id")
if enc_key_id:
return UUID(bytes=base64.b64decode(enc_key_id))
for tenc in get_boxes(init_data, b"tenc"):
if tenc.key_ID.int != 0:
return tenc.key_ID
for uuid_box in get_boxes(init_data, b"uuid"):
if uuid_box.extended_type == UUID("8974dbce-7be7-4c51-84f9-7148f9882554"): # tenc
tenc = uuid_box.data
if tenc.key_ID.int != 0:
return tenc.key_ID
def load_drm_if_needed(self, service=None) -> bool:
"""
Load DRM information for this track if it was deferred during parsing.
Args:
service: Service instance that can fetch track-specific DRM info
Returns:
True if DRM was loaded or already present, False if failed
"""
if not getattr(self, "needs_drm_loading", False):
return bool(self.drm)
if self.drm:
self.needs_drm_loading = False
return True
if not service or not hasattr(service, "get_track_drm"):
return self.load_drm_from_playlist()
try:
track_drm = service.get_track_drm(self)
if track_drm:
self.drm = track_drm if isinstance(track_drm, list) else [track_drm]
self.needs_drm_loading = False
return True
except Exception as e:
raise ValueError(f"Failed to load DRM from service for track {self.id}: {e}")
return self.load_drm_from_playlist()
def load_drm_from_playlist(self) -> bool:
"""
Fallback method to load DRM by fetching this track's individual playlist.
"""
if self.drm:
self.needs_drm_loading = False
return True
try:
import m3u8
from pyplayready.system.pssh import PSSH as PR_PSSH
from pywidevine.cdm import Cdm as WidevineCdm
from pywidevine.pssh import PSSH as WV_PSSH
session = getattr(self, "session", None) or Session()
response = session.get(self.url)
playlist = m3u8.loads(response.text, self.url)
drm_list = []
for key in playlist.keys or []:
if not key or not key.keyformat:
continue
fmt = key.keyformat.lower()
if fmt == WidevineCdm.urn:
pssh_b64 = key.uri.split(",")[-1]
drm = Widevine(pssh=WV_PSSH(pssh_b64))
drm_list.append(drm)
elif "com.microsoft.playready" in fmt:
pssh_b64 = key.uri.split(",")[-1]
drm = PlayReady(pssh=PR_PSSH(pssh_b64), pssh_b64=pssh_b64)
drm_list.append(drm)
if drm_list:
self.drm = drm_list
self.needs_drm_loading = False
return True
except Exception as e:
raise ValueError(f"Failed to load DRM from playlist for track {self.id}: {e}")
return False
def get_init_segment(
self,
maximum_size: int = 20000,
url: Optional[str] = None,
byte_range: Optional[str] = None,
session: Optional[Session] = None,
) -> bytes:
"""
Get the Track's Initial Segment Data Stream.
HLS and DASH tracks must explicitly provide a URL to the init segment or file.
Providing the byte-range for the init segment is recommended where possible.
If `byte_range` is not set, it will make a HEAD request and check the size of
the file. If the size could not be determined, it will download up to the first
20KB only, which should contain the entirety of the init segment. You may
override this by changing the `maximum_size`.
The default maximum_size of 20000 (20KB) is a tried-and-tested value that
seems to work well across the board.
Parameters:
maximum_size: Size to assume as the content length if byte-range is not
used, the content size could not be determined, or the content size
is larger than it. A value of 20000 (20KB) or higher is recommended.
url: Explicit init map or file URL to probe from.
byte_range: Range of bytes to download from the explicit or implicit URL.
session: Session context, e.g., authorization and headers.
"""
if not isinstance(maximum_size, int):
raise TypeError(f"Expected maximum_size to be an {int}, not {type(maximum_size)}")
if not isinstance(url, (str, type(None))):
raise TypeError(f"Expected url to be a {str}, not {type(url)}")
if not isinstance(byte_range, (str, type(None))):
raise TypeError(f"Expected byte_range to be a {str}, not {type(byte_range)}")
if not isinstance(session, (Session, CurlSession, type(None))):
raise TypeError(f"Expected session to be a {Session} or {CurlSession}, not {type(session)}")
if not url:
if self.descriptor != self.Descriptor.URL:
raise ValueError(f"An explicit URL must be provided for {self.descriptor.name} tracks")
if not self.url:
raise ValueError("An explicit URL must be provided as the track has no URL")
url = self.url
if not session:
session = Session()
content_length = maximum_size
if byte_range:
if not re.match(r"^\d+-\d+$", byte_range):
raise ValueError(f"The value of byte_range is unrecognized: '{byte_range}'")
start, end = map(int, byte_range.split("-"))
if start > end:
raise ValueError(f"The start range cannot be greater than the end range: {start}>{end}")
else:
size_test = session.head(url)
if "Content-Length" in size_test.headers:
content_length_header = int(size_test.headers["Content-Length"])
if content_length_header > 0:
content_length = min(content_length_header, maximum_size)
range_test = session.head(url, headers={"Range": "bytes=0-1"})
if range_test.status_code == 206:
byte_range = f"0-{content_length - 1}"
if byte_range:
res = session.get(url=url, headers={"Range": f"bytes={byte_range}"})
res.raise_for_status()
init_data = res.content
else:
init_data = None
with session.get(url, stream=True) as s:
for chunk in s.iter_content(content_length):
init_data = chunk
break
if not init_data:
raise ValueError(f"Failed to read {content_length} bytes from the track URI.")
return init_data
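# Example sketch (URL and byte range are illustrative):
#   init = track.get_init_segment(url="https://cdn.example.com/init.mp4", byte_range="0-2047")
#   kid = track.get_key_id(init_data=init)  # probe for a tenc box / enc_key_id tag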
def repackage(self) -> None:
if not self.path or not self.path.exists():
raise ValueError("Cannot repackage a Track that has not been downloaded.")
if not binaries.FFMPEG:
raise EnvironmentError('FFmpeg executable "ffmpeg" was not found but is required for this call.')
original_path = self.path
output_path = original_path.with_stem(f"{original_path.stem}_repack")
def _ffmpeg(extra_args: Optional[list[str]] = None):
args = [
binaries.FFMPEG,
"-hide_banner",
"-loglevel",
"error",
"-i",
original_path,
*(extra_args or []),
]
if hasattr(self, "data") and self.data.get("audio_language"):
audio_lang = self.data["audio_language"]
audio_name = self.data.get("audio_language_name", audio_lang)
args.extend(
[
"-metadata:s:a:0",
f"language={audio_lang}",
"-metadata:s:a:0",
f"title={audio_name}",
"-metadata:s:a:0",
f"handler_name={audio_name}",
]
)
args.extend(
[
# Following are very important!
"-map_metadata",
"-1", # don't transfer metadata to output file
"-fflags",
"bitexact", # only have minimal tag data, reproducible mux
"-codec",
"copy",
str(output_path),
]
)
subprocess.run(
args,
check=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
try:
_ffmpeg()
except subprocess.CalledProcessError as e:
if b"Malformed AAC bitstream detected" in e.stderr:
# e.g., TruTV's dodgy encodes
_ffmpeg(["-y", "-bsf:a", "aac_adtstoasc"])
else:
raise
original_path.unlink()
self.path = output_path
__all__ = ("Track",)

View File

@ -0,0 +1,531 @@
from __future__ import annotations
import logging
import subprocess
from functools import partial
from pathlib import Path
from typing import Callable, Iterator, Optional, Sequence, Union
from langcodes import Language, closest_supported_match
from rich.progress import BarColumn, Progress, SpinnerColumn, TextColumn, TimeRemainingColumn
from rich.table import Table
from rich.tree import Tree
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.console import console
from unshackle.core.constants import LANGUAGE_EXACT_DISTANCE, LANGUAGE_MAX_DISTANCE, AnyTrack, TrackT
from unshackle.core.events import events
from unshackle.core.tracks.attachment import Attachment
from unshackle.core.tracks.audio import Audio
from unshackle.core.tracks.chapters import Chapter, Chapters
from unshackle.core.tracks.subtitle import Subtitle
from unshackle.core.tracks.track import Track
from unshackle.core.tracks.video import Video
from unshackle.core.utilities import is_close_match, sanitize_filename
from unshackle.core.utils.collections import as_list, flatten
class Tracks:
"""
Video, Audio, Subtitle, Chapter, and Attachment Track Store.
It provides convenience functions for listing, sorting, and selecting tracks.
"""
TRACK_ORDER_MAP = {Video: 0, Audio: 1, Subtitle: 2, Chapter: 3, Attachment: 4}
def __init__(
self,
*args: Union[
Tracks, Sequence[Union[AnyTrack, Chapter, Chapters, Attachment]], Track, Chapter, Chapters, Attachment
],
):
self.videos: list[Video] = []
self.audio: list[Audio] = []
self.subtitles: list[Subtitle] = []
self.chapters = Chapters()
self.attachments: list[Attachment] = []
if args:
self.add(args)
def __iter__(self) -> Iterator[AnyTrack]:
return iter(as_list(self.videos, self.audio, self.subtitles))
def __len__(self) -> int:
return len(self.videos) + len(self.audio) + len(self.subtitles)
def __add__(
self,
other: Union[
Tracks, Sequence[Union[AnyTrack, Chapter, Chapters, Attachment]], Track, Chapter, Chapters, Attachment
],
) -> Tracks:
self.add(other)
return self
def __repr__(self) -> str:
return "{name}({items})".format(
name=self.__class__.__name__, items=", ".join([f"{k}={repr(v)}" for k, v in self.__dict__.items()])
)
def __str__(self) -> str:
rep = {Video: [], Audio: [], Subtitle: [], Chapter: [], Attachment: []}
tracks = [*list(self), *self.chapters]
for track in sorted(tracks, key=lambda t: self.TRACK_ORDER_MAP[type(t)]):
if not rep[type(track)]:
count = sum(type(x) is type(track) for x in tracks)
rep[type(track)].append(
"{count} {type} Track{plural}{colon}".format(
count=count,
type=track.__class__.__name__,
plural="s" if count != 1 else "",
colon=":" if count > 0 else "",
)
)
rep[type(track)].append(str(track))
for type_ in list(rep):
if not rep[type_]:
del rep[type_]
continue
rep[type_] = "\n".join([rep[type_][0]] + [f"├─ {x}" for x in rep[type_][1:-1]] + [f"└─ {rep[type_][-1]}"])
rep = "\n".join(list(rep.values()))
return rep
def tree(self, add_progress: bool = False) -> tuple[Tree, list[partial]]:
all_tracks = [*list(self), *self.chapters, *self.attachments]
progress_callables = []
tree = Tree("", hide_root=True)
for track_type in self.TRACK_ORDER_MAP:
tracks = list(x for x in all_tracks if isinstance(x, track_type))
if not tracks:
continue
num_tracks = len(tracks)
track_type_plural = track_type.__name__ + ("s" if track_type != Audio and num_tracks != 1 else "")
tracks_tree = tree.add(f"[repr.number]{num_tracks}[/] {track_type_plural}")
for track in tracks:
if add_progress and track_type not in (Chapter, Attachment):
progress = Progress(
SpinnerColumn(finished_text=""),
BarColumn(),
"",
TimeRemainingColumn(compact=True, elapsed_when_finished=True),
"",
TextColumn("[progress.data.speed]{task.fields[downloaded]}"),
console=console,
speed_estimate_period=10,
)
task = progress.add_task("", downloaded="-")
progress_callables.append(partial(progress.update, task_id=task))
track_table = Table.grid()
track_table.add_row(str(track)[6:], style="text2")
track_table.add_row(progress)
tracks_tree.add(track_table)
else:
tracks_tree.add(str(track)[6:], style="text2")
return tree, progress_callables
def exists(self, by_id: Optional[str] = None, by_url: Optional[Union[str, list[str]]] = None) -> bool:
"""Check if a track already exists by various methods."""
if by_id: # recommended
return any(x.id == by_id for x in self)
if by_url:
return any(x.url == by_url for x in self)
return False
def add(
self,
tracks: Union[
Tracks, Sequence[Union[AnyTrack, Chapter, Chapters, Attachment]], Track, Chapter, Chapters, Attachment
],
warn_only: bool = False,
) -> None:
"""Add a provided track to its appropriate array and ensuring it's not a duplicate."""
if isinstance(tracks, Tracks):
tracks = [*list(tracks), *tracks.chapters, *tracks.attachments]
duplicates = 0
for track in flatten(tracks):
if self.exists(by_id=track.id):
if not warn_only:
raise ValueError(
"One or more of the provided Tracks is a duplicate. "
"Track IDs must be unique but accurate using static values. The "
"value should stay the same no matter when you request the same "
"content. Use a value that has relation to the track content "
"itself and is static or permanent and not random/RNG data that "
"wont change each refresh or conflict in edge cases."
)
duplicates += 1
continue
if isinstance(track, Video):
self.videos.append(track)
elif isinstance(track, Audio):
self.audio.append(track)
elif isinstance(track, Subtitle):
self.subtitles.append(track)
elif isinstance(track, Chapter):
self.chapters.add(track)
elif isinstance(track, Attachment):
self.attachments.append(track)
else:
raise ValueError("Track type was not set or is invalid.")
log = logging.getLogger("Tracks")
if duplicates:
log.debug(f" - Found and skipped {duplicates} duplicate tracks...")
def sort_videos(self, by_language: Optional[Sequence[Union[str, Language]]] = None) -> None:
"""Sort video tracks by bitrate, and optionally language."""
if not self.videos:
return
# bitrate
self.videos.sort(key=lambda x: float(x.bitrate or 0.0), reverse=True)
# language
for language in reversed(by_language or []):
if str(language) in ("all", "best"):
language = next((x.language for x in self.videos if x.is_original_lang), "")
if not language:
continue
self.videos.sort(key=lambda x: str(x.language))
self.videos.sort(key=lambda x: not is_close_match(language, [x.language]))
def sort_audio(self, by_language: Optional[Sequence[Union[str, Language]]] = None) -> None:
"""Sort audio tracks by bitrate, descriptive, and optionally language."""
if not self.audio:
return
# descriptive
self.audio.sort(key=lambda x: x.descriptive)
# bitrate (within each descriptive group)
self.audio.sort(key=lambda x: float(x.bitrate or 0.0), reverse=True)
# language
for language in reversed(by_language or []):
if str(language) in ("all", "best"):
language = next((x.language for x in self.audio if x.is_original_lang), "")
if not language:
continue
self.audio.sort(key=lambda x: not is_close_match(language, [x.language]))
def sort_subtitles(self, by_language: Optional[Sequence[Union[str, Language]]] = None) -> None:
"""
Sort subtitle tracks by various track attributes to a common P2P standard.
You may optionally provide a sequence of languages to prioritize to the top.
Section Order:
- by_language groups prioritized to top, and ascending alphabetically
- then rest ascending alphabetically after the prioritized groups
(Each section ascending alphabetically, but separated)
Language Group Order:
- Forced
- Normal
- Hard of Hearing (SDH/CC)
(Least to most captions expected in the subtitle)
"""
if not self.subtitles:
return
# language groups
self.subtitles.sort(key=lambda x: str(x.language))
self.subtitles.sort(key=lambda x: x.sdh or x.cc)
self.subtitles.sort(key=lambda x: x.forced, reverse=True)
# sections
for language in reversed(by_language or []):
if str(language) == "all":
language = next((x.language for x in self.subtitles if x.is_original_lang), "")
if not language:
continue
self.subtitles.sort(key=lambda x: is_close_match(language, [x.language]), reverse=True)
def select_video(self, x: Callable[[Video], bool]) -> None:
self.videos = list(filter(x, self.videos))
def select_audio(self, x: Callable[[Audio], bool]) -> None:
self.audio = list(filter(x, self.audio))
def select_subtitles(self, x: Callable[[Subtitle], bool]) -> None:
self.subtitles = list(filter(x, self.subtitles))
def select_hybrid(self, tracks, quality):
hdr10_tracks = [
v
for v in tracks
if v.range == Video.Range.HDR10 and (v.height in quality or int(v.width * 9 / 16) in quality)
]
hdr10 = []
for res in quality:
candidates = [v for v in hdr10_tracks if v.height == res or int(v.width * 9 / 16) == res]
if candidates:
best = max(candidates, key=lambda v: v.bitrate or 0) # tolerate tracks with no known bitrate
hdr10.append(best)
dv_tracks = [v for v in tracks if v.range == Video.Range.DV]
lowest_dv = min(dv_tracks, key=lambda v: v.height) if dv_tracks else None
def select(x):
if x in hdr10:
return True
if lowest_dv and x is lowest_dv:
return True
return False
return select
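# For example, quality=[2160] with tracks [HDR10 2160p @ 20 Mb/s, HDR10 2160p @ 12 Mb/s,
# DV 2160p, DV 1080p] returns a predicate that keeps the 20 Mb/s HDR10 track plus the
# 1080p DV track (the lowest-resolution DV stream), i.e. the inputs the Hybrid class expects.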
def by_resolutions(self, resolutions: list[int], per_resolution: int = 0) -> None:
# Note: Do not merge these list comprehensions. They must be done separately so the results
# from the 16:9 canvas check are only used if there's no exact height resolution match.
selected = []
for resolution in resolutions:
matches = [ # exact matches
x for x in self.videos if x.height == resolution
]
if not matches:
matches = [ # 16:9 canvas matches
x for x in self.videos if int(x.width * (9 / 16)) == resolution
]
selected.extend(matches[: per_resolution or None])
self.videos = selected
@staticmethod
def by_language(
tracks: list[TrackT], languages: list[str], per_language: int = 0, exact_match: bool = False
) -> list[TrackT]:
distance = LANGUAGE_EXACT_DISTANCE if exact_match else LANGUAGE_MAX_DISTANCE
selected = []
for language in languages:
selected.extend(
[x for x in tracks if closest_supported_match(str(x.language), [language], distance)][
: per_language or None
]
)
return selected
def mux(
self,
title: str,
delete: bool = True,
progress: Optional[partial] = None,
audio_expected: bool = True,
title_language: Optional[Language] = None,
) -> tuple[Path, int, list[str]]:
"""
Multiplex all the Tracks into a Matroska Container file.
Parameters:
title: Set the Matroska Container file title. Usually displayed in players
instead of the filename if set.
delete: Delete all track files after multiplexing.
progress: Update a rich progress bar via `completed=...`. This must be the
progress object's update() func, pre-set with task id via functools.partial.
audio_expected: Whether audio is expected in the output. Used to determine
if embedded audio metadata should be added.
title_language: The title's intended language. Used to select the best video track
for audio metadata when multiple video tracks exist.
"""
if self.videos and not self.audio and audio_expected:
video_track = None
if title_language:
video_track = next((v for v in self.videos if v.language == title_language), None)
if not video_track:
video_track = next((v for v in self.videos if v.is_original_lang), None)
video_track = video_track or self.videos[0]
if video_track.language.is_valid():
lang_code = str(video_track.language)
lang_name = video_track.language.display_name()
for video in self.videos:
video.needs_repack = True
video.data["audio_language"] = lang_code
video.data["audio_language_name"] = lang_name
if not binaries.MKVToolNix:
raise RuntimeError("MKVToolNix (mkvmerge) is required for muxing but was not found")
cl = [
str(binaries.MKVToolNix),
"--no-date", # remove dates from the output for security
]
if config.muxing.get("set_title", True):
cl.extend(["--title", title])
for i, vt in enumerate(self.videos):
if not vt.path or not vt.path.exists():
raise ValueError("Video Track must be downloaded before muxing...")
events.emit(events.Types.TRACK_MULTIPLEX, track=vt)
is_default = False
if title_language:
is_default = vt.language == title_language
if not any(v.language == title_language for v in self.videos):
is_default = vt.is_original_lang or i == 0
else:
is_default = i == 0
# Prepare base arguments
video_args = [
"--language",
f"0:{vt.language}",
"--default-track",
f"0:{is_default}",
"--original-flag",
f"0:{vt.is_original_lang}",
"--compression",
"0:none", # disable extra compression
]
# Add FPS fix if needed (typically for hybrid mode to prevent sync issues)
if hasattr(vt, "needs_duration_fix") and vt.needs_duration_fix and vt.fps:
video_args.extend(
[
"--default-duration",
f"0:{vt.fps}fps" if isinstance(vt.fps, str) else f"0:{vt.fps:.3f}fps",
"--fix-bitstream-timing-information",
"0:1",
]
)
if hasattr(vt, "range") and vt.range == Video.Range.HLG:
video_args.extend(
[
"--color-transfer-characteristics",
"0:18", # ARIB STD-B67 (HLG)
]
)
if hasattr(vt, "data") and vt.data.get("audio_language"):
audio_lang = vt.data["audio_language"]
audio_name = vt.data.get("audio_language_name", audio_lang)
video_args.extend(
[
"--language",
f"1:{audio_lang}",
"--track-name",
f"1:{audio_name}",
]
)
cl.extend(video_args + ["(", str(vt.path), ")"])
for i, at in enumerate(self.audio):
if not at.path or not at.path.exists():
raise ValueError("Audio Track must be downloaded before muxing...")
events.emit(events.Types.TRACK_MULTIPLEX, track=at)
cl.extend(
[
"--track-name",
f"0:{at.get_track_name() or ''}",
"--language",
f"0:{at.language}",
"--default-track",
f"0:{at.is_original_lang}",
"--visual-impaired-flag",
f"0:{at.descriptive}",
"--original-flag",
f"0:{at.is_original_lang}",
"--compression",
"0:none", # disable extra compression
"(",
str(at.path),
")",
]
)
for st in self.subtitles:
if not st.path or not st.path.exists():
raise ValueError("Text Track must be downloaded before muxing...")
events.emit(events.Types.TRACK_MULTIPLEX, track=st)
default = bool(self.audio and is_close_match(st.language, [self.audio[0].language]) and st.forced)
cl.extend(
[
"--track-name",
f"0:{st.get_track_name() or ''}",
"--language",
f"0:{st.language}",
"--sub-charset",
"0:UTF-8",
"--forced-track",
f"0:{st.forced}",
"--default-track",
f"0:{default}",
"--hearing-impaired-flag",
f"0:{st.sdh}",
"--original-flag",
f"0:{st.is_original_lang}",
"--compression",
"0:none", # disable extra compression (probably zlib)
"(",
str(st.path),
")",
]
)
if self.chapters:
chapters_path = config.directories.temp / config.filenames.chapters.format(
title=sanitize_filename(title), random=self.chapters.id
)
self.chapters.dump(chapters_path, fallback_name=config.chapter_fallback_name)
cl.extend(["--chapter-charset", "UTF-8", "--chapters", str(chapters_path)])
else:
chapters_path = None
for attachment in self.attachments:
if not attachment.path or not attachment.path.exists():
raise ValueError("Attachment File was not found...")
cl.extend(
[
"--attachment-description",
attachment.description or "",
"--attachment-mime-type",
attachment.mime_type,
"--attachment-name",
attachment.name,
"--attach-file",
str(attachment.path.resolve()),
]
)
output_path = (
self.videos[0].path.with_suffix(".muxed.mkv")
if self.videos
else self.audio[0].path.with_suffix(".muxed.mka")
if self.audio
else self.subtitles[0].path.with_suffix(".muxed.mks")
if self.subtitles
else chapters_path.with_suffix(".muxed.mkv")
if self.chapters
else None
)
if not output_path:
raise ValueError("No tracks provided, at least one track must be provided.")
# let potential failures go to caller, caller should handle
try:
errors = []
p = subprocess.Popen([*cl, "--output", str(output_path), "--gui-mode"], text=True, stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, ""):
if line.startswith("#GUI#error") or line.startswith("#GUI#warning"):
errors.append(line)
if "progress" in line:
progress(total=100, completed=int(line.strip()[14:-1]))
return output_path, p.wait(), errors
finally:
if chapters_path:
chapters_path.unlink(missing_ok=True)
if delete:
for track in self:
track.delete()
for attachment in self.attachments:
if attachment.path and attachment.path.exists():
attachment.path.unlink()
__all__ = ("Tracks",)

View File

@ -0,0 +1,473 @@
from __future__ import annotations
import logging
import math
import re
import subprocess
from enum import Enum
from pathlib import Path
from typing import Any, Optional, Union
from langcodes import Language
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.tracks.subtitle import Subtitle
from unshackle.core.tracks.track import Track
from unshackle.core.utilities import FPS, get_boxes
class Video(Track):
class Codec(str, Enum):
AVC = "H.264"
HEVC = "H.265"
VC1 = "VC-1"
VP8 = "VP8"
VP9 = "VP9"
AV1 = "AV1"
@property
def extension(self) -> str:
return self.value.lower().replace(".", "").replace("-", "")
@staticmethod
def from_mime(mime: str) -> Video.Codec:
mime = mime.lower().strip().split(".")[0]
if mime in (
"avc1",
"avc2",
"avc3",
"dva1",
"dvav", # Dolby Vision
):
return Video.Codec.AVC
if mime in (
"hev1",
"hev2",
"hev3",
"hvc1",
"hvc2",
"hvc3",
"dvh1",
"dvhe", # Dolby Vision
"lhv1",
"lhe1", # Layered
):
return Video.Codec.HEVC
if mime == "vc-1":
return Video.Codec.VC1
if mime in ("vp08", "vp8"):
return Video.Codec.VP8
if mime in ("vp09", "vp9"):
return Video.Codec.VP9
if mime == "av01":
return Video.Codec.AV1
raise ValueError(f"The MIME '{mime}' is not a supported Video Codec")
@staticmethod
def from_codecs(codecs: str) -> Video.Codec:
for codec in codecs.lower().split(","):
codec = codec.strip()
mime = codec.split(".")[0]
try:
return Video.Codec.from_mime(mime)
except ValueError:
pass
raise ValueError(f"No MIME types matched any supported Video Codecs in '{codecs}'")
@staticmethod
def from_netflix_profile(profile: str) -> Video.Codec:
profile = profile.lower().strip()
if profile.startswith(("h264", "playready-h264")):
return Video.Codec.AVC
if profile.startswith("hevc"):
return Video.Codec.HEVC
if profile.startswith("vp9"):
return Video.Codec.VP9
if profile.startswith("av1"):
return Video.Codec.AV1
raise ValueError(f"The Content Profile '{profile}' is not a supported Video Codec")
class Range(str, Enum):
SDR = "SDR" # No Dynamic Range
HLG = "HLG" # https://en.wikipedia.org/wiki/Hybrid_log%E2%80%93gamma
HDR10 = "HDR10" # https://en.wikipedia.org/wiki/HDR10
HDR10P = "HDR10+" # https://en.wikipedia.org/wiki/HDR10%2B
DV = "DV" # https://en.wikipedia.org/wiki/Dolby_Vision
HYBRID = "HYBRID" # Selects both HDR10 and DV tracks for hybrid processing with DoviTool
@staticmethod
def from_cicp(primaries: int, transfer: int, matrix: int) -> Video.Range:
"""
Convert CICP (Coding-Independent Code Points) values to Video Range.
CICP is defined in ITU-T H.273 and ISO/IEC 23091-2 for signaling video
color properties independently of the compression codec. These values are
used across AVC (H.264), HEVC (H.265), VVC, AV1, and other modern codecs.
The enum values (Primaries, Transfer, Matrix) match the official specifications:
- ITU-T H.273: Coding-independent code points for video signal type identification
- ISO/IEC 23091-2: Information technology - Coding-independent code points - Part 2: Video
- H.264 Table E-3 (Colour Primaries) and Table E-4 (Transfer Characteristics)
- H.265 Table E.3 and E.4 (identical to H.264)
Note: Value 0 = "Reserved" and Value 2 = "Unspecified" per specification.
While both effectively mean "unknown" in practice, the distinction matters for
spec compliance. Value 2 was added based on user feedback (GitHub issue) and
verified against FFmpeg's AVColorPrimaries/AVColorTransferCharacteristic enums.
Sources:
- https://www.itu.int/rec/T-REC-H.273
- https://www.itu.int/rec/T-REC-H.Sup19-202104-I
- https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/pixfmt.h
"""
class Primaries(Enum):
Reserved = 0
BT_709 = 1
Unspecified = 2
BT_601_625 = 5
BT_601_525 = 6
BT_2020_and_2100 = 9
SMPTE_ST_2113_and_EG_4321 = 12 # P3D65
class Transfer(Enum):
Reserved = 0
BT_709 = 1
Unspecified = 2
BT_601 = 6
BT_2020 = 14
BT_2100 = 15
BT_2100_PQ = 16
BT_2100_HLG = 18
class Matrix(Enum):
RGB = 0
YCbCr_BT_709 = 1
YCbCr_BT_601_625 = 5
YCbCr_BT_601_525 = 6
YCbCr_BT_2020_and_2100 = 9 # YCbCr BT.2100 shares the same CP
ICtCp_BT_2100 = 14
if transfer == 5:
# While not part of any standard, it is typically used as a PAL variant of Transfer.BT_601=6.
# i.e. where Transfer 6 would be for BT.601-NTSC and Transfer 5 would be for BT.601-PAL.
# The codebase is currently agnostic to either, so a manual conversion to 6 is done.
transfer = 6
primaries = Primaries(primaries)
transfer = Transfer(transfer)
matrix = Matrix(matrix)
# primaries and matrix do not strictly correlate to a range
if (primaries, transfer, matrix) == (Primaries.Reserved, Transfer.Reserved, Matrix.RGB):
return Video.Range.SDR
elif primaries in (Primaries.BT_601_625, Primaries.BT_601_525):
return Video.Range.SDR
elif transfer == Transfer.BT_2100_PQ:
return Video.Range.HDR10
elif transfer == Transfer.BT_2100_HLG:
return Video.Range.HLG
else:
return Video.Range.SDR
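# Examples of the mapping above (CICP code points per ITU-T H.273):
#   Video.Range.from_cicp(9, 16, 9) -> Video.Range.HDR10  (BT.2020 primaries, PQ transfer)
#   Video.Range.from_cicp(9, 18, 9) -> Video.Range.HLG    (BT.2020 primaries, HLG transfer)
#   Video.Range.from_cicp(1, 1, 1)  -> Video.Range.SDR    (BT.709 throughout)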
@staticmethod
def from_m3u_range_tag(tag: str) -> Optional[Video.Range]:
tag = (tag or "").upper().replace('"', "").strip()
if not tag:
return None
if tag == "SDR":
return Video.Range.SDR
elif tag == "PQ":
return Video.Range.HDR10 # technically could be any PQ-transfer range
elif tag == "HLG":
return Video.Range.HLG
# for some reason there's no Dolby Vision info tag
raise ValueError(f"The M3U Range Tag '{tag}' is not a supported Video Range")
def __init__(
self,
*args: Any,
codec: Optional[Video.Codec] = None,
range_: Optional[Video.Range] = None,
bitrate: Optional[Union[str, int, float]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
fps: Optional[Union[str, int, float]] = None,
**kwargs: Any,
) -> None:
"""
Create a new Video track object.
Parameters:
codec: A Video.Codec enum representing the video codec.
If not specified, MediaInfo will be used to retrieve the codec
once the track has been downloaded.
range_: A Video.Range enum representing the video color range.
Defaults to SDR if not specified.
bitrate: A number or float representing the average bandwidth in bytes/s.
Float values are rounded up to the nearest integer.
width: The horizontal resolution of the video.
height: The vertical resolution of the video.
fps: A number, float, or string representing the frames/s of the video.
Strings may represent numbers, floats, or a fraction (num/den).
All strings will be cast to either a number or float.
Note: If codec, bitrate, width, height, or fps is not specified some checks
may be skipped or assume a value. Specifying as much information as possible
is highly recommended.
"""
super().__init__(*args, **kwargs)
if not isinstance(codec, (Video.Codec, type(None))):
raise TypeError(f"Expected codec to be a {Video.Codec}, not {codec!r}")
if not isinstance(range_, (Video.Range, type(None))):
raise TypeError(f"Expected range_ to be a {Video.Range}, not {range_!r}")
if not isinstance(bitrate, (str, int, float, type(None))):
raise TypeError(f"Expected bitrate to be a {str}, {int}, or {float}, not {bitrate!r}")
if not isinstance(width, (int, str, type(None))):
raise TypeError(f"Expected width to be a {int}, not {width!r}")
if not isinstance(height, (int, str, type(None))):
raise TypeError(f"Expected height to be a {int}, not {height!r}")
if not isinstance(fps, (str, int, float, type(None))):
raise TypeError(f"Expected fps to be a {str}, {int}, or {float}, not {fps!r}")
self.codec = codec
self.range = range_ or Video.Range.SDR
try:
self.bitrate = int(math.ceil(float(bitrate))) if bitrate else None
except (ValueError, TypeError) as e:
raise ValueError(f"Expected bitrate to be a number or float, {e}")
try:
self.width = int(width or 0) or None
except ValueError as e:
raise ValueError(f"Expected width to be a number, not {width!r}, {e}")
try:
self.height = int(height or 0) or None
except ValueError as e:
raise ValueError(f"Expected height to be a number, not {height!r}, {e}")
try:
self.fps = (FPS.parse(str(fps)) or None) if fps else None
except Exception as e:
raise ValueError("Expected fps to be a number, float, or a string as numerator/denominator form, " + str(e))
self.needs_duration_fix = False
def __str__(self) -> str:
return " | ".join(
filter(
bool,
[
"VID",
"[" + (", ".join(filter(bool, [self.codec.value if self.codec else None, self.range.name]))) + "]",
str(self.language),
", ".join(
filter(
bool,
[
" @ ".join(
filter(
bool,
[
f"{self.width}x{self.height}" if self.width and self.height else None,
f"{self.bitrate // 1000} kb/s" if self.bitrate else None,
],
)
),
f"{self.fps:.3f} FPS" if self.fps else None,
],
)
),
self.edition,
],
)
)
def change_color_range(self, range_: int) -> None:
"""Change the Video's Color Range to Limited (0) or Full (1)."""
if not self.path or not self.path.exists():
raise ValueError("Cannot change the color range flag on a Video that has not been downloaded.")
if not self.codec:
raise ValueError("Cannot change the color range flag on a Video that has no codec specified.")
if self.codec not in (Video.Codec.AVC, Video.Codec.HEVC):
raise NotImplementedError(
"Cannot change the color range flag on this Video as "
f"it's codec, {self.codec.value}, is not yet supported."
)
if not binaries.FFMPEG:
raise EnvironmentError('FFmpeg executable "ffmpeg" was not found but is required for this call.')
filter_key = {Video.Codec.AVC: "h264_metadata", Video.Codec.HEVC: "hevc_metadata"}[self.codec]
original_path = self.path
output_path = original_path.with_stem(f"{original_path.stem}_{['limited', 'full'][range_]}_range")
subprocess.run(
[
binaries.FFMPEG,
"-hide_banner",
"-loglevel",
"panic",
"-i",
original_path,
"-codec",
"copy",
"-bsf:v",
f"{filter_key}=video_full_range_flag={range_}",
str(output_path),
],
check=True,
)
self.path = output_path
original_path.unlink()
def ccextractor(
self, track_id: Any, out_path: Union[Path, str], language: Language, original: bool = False
) -> Optional[Subtitle]:
"""Return a TextTrack object representing CC track extracted by CCExtractor."""
if not self.path:
raise ValueError("You must download the track first.")
if not binaries.CCExtractor:
raise EnvironmentError("ccextractor executable was not found.")
# ccextractor often fails in weird ways unless we repack
self.repackage()
out_path = Path(out_path)
try:
subprocess.run(
[binaries.CCExtractor, "-trim", "-nobom", "-noru", "-ru1", "-o", out_path, self.path],
check=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
except subprocess.CalledProcessError as e:
out_path.unlink(missing_ok=True)
if e.returncode != 10: # No captions found
raise
if out_path.exists():
cc_track = Subtitle(
id_=track_id,
url="", # doesn't need to be downloaded
codec=Subtitle.Codec.SubRip,
language=language,
is_original_lang=original,
cc=True,
)
cc_track.path = out_path
return cc_track
return None
def extract_c608(self) -> list[Subtitle]:
"""
Extract Apple-Style c608 box (CEA-608) subtitle using ccextractor.
This is little more than a wrapper around the ccextractor function: it
checks whether a c608 box exists and only calls ccextractor if one does.
Even though there is a possibility of more than one c608 box, only one
can actually be extracted. Not only that but it's very possible this
needs to be done before any decryption as the decryption may destroy
some of the metadata.
TODO: Need a test file with more than one c608 box to add support for
more than one CEA-608 extraction.
"""
if not self.path:
raise ValueError("You must download the track first.")
with self.path.open("rb") as f:
# assuming 20KB is enough to contain the c608 box.
# ffprobe will fail, so a c608 box check must be done.
c608_count = len(list(get_boxes(f.read(20000), b"c608")))
if c608_count > 0:
# TODO: Figure out the real language, it might be different
# CEA-608 boxes don't seem to carry language information :(
# TODO: Figure out if the CC language is original lang or not.
# Will need to figure out above first to do so.
track_id = f"ccextractor-{self.id}"
cc_lang = self.language
cc_track = self.ccextractor(
track_id=track_id,
out_path=config.directories.temp / config.filenames.subtitle.format(id=track_id, language=cc_lang),
language=cc_lang,
original=False,
)
if not cc_track:
return []
return [cc_track]
return []
def remove_eia_cc(self) -> bool:
"""
Remove EIA-CC data from the bitstream while keeping SEI data.
This works by removing all NAL units of type 6 from the bitstream and
then re-adding the SEI data (effectively a new NAL unit with just the
SEI data). Only bitstreams with x264 encoding information are currently
supported due to the obscurity of the MDAT mp4 box structure, hence the
reliance on a somewhat hacky regex.
"""
if not self.path or not self.path.exists():
raise ValueError("Cannot clean a Track that has not been downloaded.")
if not binaries.FFMPEG:
raise EnvironmentError('FFmpeg executable "ffmpeg" was not found but is required for this call.')
log = logging.getLogger("x264-clean")
log.info("Removing EIA-CC from Video Track with FFMPEG")
with open(self.path, "rb") as f:
file = f.read(60000)
x264 = re.search(rb"(.{16})(x264)", file)
if not x264:
log.info(" - No x264 encode settings were found, unsupported...")
return False
uuid = x264.group(1).hex()
i = file.index(b"x264")
encoding_settings = file[i : i + file[i:].index(b"\x00")].replace(b":", rb"\\:").replace(b",", rb"\,").decode()
original_path = self.path
cleaned_path = original_path.with_suffix(f".cleaned{original_path.suffix}")
subprocess.run(
[
binaries.FFMPEG,
"-hide_banner",
"-loglevel",
"panic",
"-i",
original_path,
"-map_metadata",
"-1",
"-fflags",
"bitexact",
"-bsf:v",
f"filter_units=remove_types=6,h264_metadata=sei_user_data={uuid}+{encoding_settings}",
"-codec",
"copy",
str(cleaned_path),
],
check=True,
)
log.info(" + Removed")
self.path = cleaned_path
original_path.unlink()
return True
__all__ = ("Video",)

View File

@ -0,0 +1,276 @@
from __future__ import annotations
import asyncio
import json
import time
from pathlib import Path
from typing import Optional
import requests
class UpdateChecker:
"""
Check for available updates from the GitHub repository.
This class provides functionality to check for newer versions of the application
by querying the GitHub releases API. It includes rate limiting, caching, and
both synchronous and asynchronous interfaces.
Attributes:
REPO_URL: GitHub API URL for latest release
TIMEOUT: Request timeout in seconds
DEFAULT_CHECK_INTERVAL: Default time between checks in seconds (24 hours)
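Example (illustrative sketch; assumes network access and a writable cache directory):
>>> newer = UpdateChecker.check_for_updates_sync("1.0.0")
>>> if newer:
...     print(f"Update available: {newer}")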
"""
REPO_URL = "https://api.github.com/repos/unshackle-dl/unshackle/releases/latest"
TIMEOUT = 5
DEFAULT_CHECK_INTERVAL = 24 * 60 * 60
@classmethod
def get_cache_file(cls) -> Path:
"""Get the path to the update check cache file."""
from unshackle.core.config import config
return config.directories.cache / "update_check.json"
@classmethod
def load_cache_data(cls) -> dict:
"""
Load cache data from file.
Returns:
Cache data dictionary or empty dict if loading fails
"""
cache_file = cls.get_cache_file()
if not cache_file.exists():
return {}
try:
with open(cache_file, "r") as f:
return json.load(f)
except (json.JSONDecodeError, OSError):
return {}
@staticmethod
def parse_version(version_string: str) -> str:
"""
Parse and normalize version string by removing 'v' prefix.
Args:
version_string: Raw version string from API
Returns:
Cleaned version string
"""
return version_string.lstrip("v")
@staticmethod
def _is_valid_version(version: str) -> bool:
"""
Validate version string format.
Args:
version: Version string to validate
Returns:
True if version string is valid semantic version, False otherwise
"""
if not version or not isinstance(version, str):
return False
try:
parts = version.split(".")
if len(parts) < 2:
return False
for part in parts:
int(part)
return True
except (ValueError, AttributeError):
return False
@classmethod
def _fetch_latest_version(cls) -> Optional[str]:
"""
Fetch the latest version from GitHub API.
Returns:
Latest version string if successful, None otherwise
"""
try:
response = requests.get(cls.REPO_URL, timeout=cls.TIMEOUT)
if response.status_code != 200:
return None
data = response.json()
latest_version = cls.parse_version(data.get("tag_name", ""))
return latest_version if cls._is_valid_version(latest_version) else None
except Exception:
return None
@classmethod
def _should_check_for_updates(cls, check_interval: int = DEFAULT_CHECK_INTERVAL) -> bool:
"""
Check if enough time has passed since the last update check.
Args:
check_interval: Time in seconds between checks (default: 24 hours)
Returns:
True if we should check for updates, False otherwise
"""
cache_data = cls.load_cache_data()
if not cache_data:
return True
last_check = cache_data.get("last_check", 0)
current_time = time.time()
return (current_time - last_check) >= check_interval
@classmethod
def _update_cache(cls, latest_version: Optional[str] = None, current_version: Optional[str] = None) -> None:
"""
Update the cache file with the current timestamp and version info.
Args:
latest_version: The latest version found, if any
current_version: The current version being used
"""
cache_file = cls.get_cache_file()
try:
cache_file.parent.mkdir(parents=True, exist_ok=True)
cache_data = {
"last_check": time.time(),
"latest_version": latest_version,
"current_version": current_version,
}
with open(cache_file, "w") as f:
json.dump(cache_data, f, indent=2)
except (OSError, TypeError, ValueError):  # json.dump raises TypeError/ValueError; there is no json.JSONEncodeError
pass
@staticmethod
def _compare_versions(current: str, latest: str) -> bool:
"""
Simple semantic version comparison.
Args:
current: Current version string (e.g., "1.1.0")
latest: Latest version string (e.g., "1.2.0")
Returns:
True if latest > current, False otherwise
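Example:
>>> UpdateChecker._compare_versions("1.1.0", "1.2.0")
True
>>> UpdateChecker._compare_versions("1.2", "1.2.0")
False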
"""
if not UpdateChecker._is_valid_version(current) or not UpdateChecker._is_valid_version(latest):
return False
try:
current_parts = [int(x) for x in current.split(".")]
latest_parts = [int(x) for x in latest.split(".")]
max_length = max(len(current_parts), len(latest_parts))
current_parts.extend([0] * (max_length - len(current_parts)))
latest_parts.extend([0] * (max_length - len(latest_parts)))
for current_part, latest_part in zip(current_parts, latest_parts):
if latest_part > current_part:
return True
elif latest_part < current_part:
return False
return False
except (ValueError, AttributeError):
return False
@classmethod
async def check_for_updates(cls, current_version: str) -> Optional[str]:
"""
Check if there's a newer version available on GitHub.
Args:
current_version: The current version string (e.g., "1.1.0")
Returns:
The latest version string if an update is available, None otherwise
"""
if not cls._is_valid_version(current_version):
return None
try:
loop = asyncio.get_running_loop()
latest_version = await loop.run_in_executor(None, cls._fetch_latest_version)
if latest_version and cls._compare_versions(current_version, latest_version):
return latest_version
except Exception:
pass
return None
@classmethod
def _get_cached_update_info(cls, current_version: str) -> Optional[str]:
"""
Check if there's a cached update available for the current version.
Args:
current_version: The current version string
Returns:
The latest version string if an update is available from cache, None otherwise
"""
cache_data = cls.load_cache_data()
if not cache_data:
return None
cached_current = cache_data.get("current_version")
cached_latest = cache_data.get("latest_version")
if cached_current == current_version and cached_latest:
if cls._compare_versions(current_version, cached_latest):
return cached_latest
return None
@classmethod
def check_for_updates_sync(cls, current_version: str, check_interval: Optional[int] = None) -> Optional[str]:
"""
Synchronous version of update check with rate limiting.
Args:
current_version: The current version string (e.g., "1.1.0")
check_interval: Time in seconds between checks (default: from config)
Returns:
The latest version string if an update is available, None otherwise
"""
if not cls._is_valid_version(current_version):
return None
if check_interval is None:
from unshackle.core.config import config
check_interval = config.update_check_interval * 60 * 60
if not cls._should_check_for_updates(check_interval):
return cls._get_cached_update_info(current_version)
latest_version = cls._fetch_latest_version()
cls._update_cache(latest_version, current_version)
if latest_version and cls._compare_versions(current_version, latest_version):
return latest_version
return None

1062
unshackle/core/utilities.py Normal file

File diff suppressed because it is too large

View File

@ -0,0 +1,292 @@
import re
from typing import Any, Optional, Union
import click
from click.shell_completion import CompletionItem
from pywidevine.cdm import Cdm as WidevineCdm
class VideoCodecChoice(click.Choice):
"""
A custom Choice type for video codecs that accepts both enum names and values.
Accepts both:
- Enum names: avc, hevc, vc1, vp8, vp9, av1
- Enum values: H.264, H.265, VC-1, VP8, VP9, AV1
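Example (illustrative; assumes the Video.Codec enum from the tracks package):
>>> VideoCodecChoice(Video.Codec).convert("hevc") is Video.Codec.HEVC
True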
"""
def __init__(self, codec_enum):
self.codec_enum = codec_enum
# Build choices from both enum names and values
choices = []
for codec in codec_enum:
choices.append(codec.name.lower()) # e.g., "avc", "hevc"
choices.append(codec.value) # e.g., "H.264", "H.265"
super().__init__(choices, case_sensitive=False)
def convert(self, value: Any, param: Optional[click.Parameter] = None, ctx: Optional[click.Context] = None):
if not value:
return None
# First try to convert using the parent class
converted_value = super().convert(value, param, ctx)
# Now map the converted value back to the enum
for codec in self.codec_enum:
if converted_value.lower() == codec.name.lower():
return codec
if converted_value == codec.value:
return codec
# This shouldn't happen if the parent conversion worked
self.fail(f"'{value}' is not a valid video codec", param, ctx)
class SubtitleCodecChoice(click.Choice):
"""
A custom Choice type for subtitle codecs that accepts both enum names, values, and common aliases.
Accepts:
- Enum names: subrip, substationalpha, substationalphav4, timedtextmarkuplang, webvtt, ftml, fvtt
- Enum values: SRT, SSA, ASS, TTML, VTT, STPP, WVTT
- Common aliases: srt (for SubRip)
"""
def __init__(self, codec_enum):
self.codec_enum = codec_enum
# Build choices from enum names, values, and common aliases
choices = []
aliases = {}
for codec in codec_enum:
choices.append(codec.name.lower()) # e.g., "subrip", "webvtt"
# Only add the value if it's different from common aliases
value_lower = codec.value.lower()
# Add common aliases and track them
if codec.name == "SubRip":
if "srt" not in choices:
choices.append("srt")
aliases["srt"] = codec
elif codec.name == "WebVTT":
if "vtt" not in choices:
choices.append("vtt")
aliases["vtt"] = codec
# Also add the enum value if different
if value_lower != "vtt" and value_lower not in choices:
choices.append(value_lower)
elif codec.name == "SubStationAlpha":
if "ssa" not in choices:
choices.append("ssa")
aliases["ssa"] = codec
# Also add the enum value if different
if value_lower != "ssa" and value_lower not in choices:
choices.append(value_lower)
elif codec.name == "SubStationAlphav4":
if "ass" not in choices:
choices.append("ass")
aliases["ass"] = codec
# Also add the enum value if different
if value_lower != "ass" and value_lower not in choices:
choices.append(value_lower)
elif codec.name == "TimedTextMarkupLang":
if "ttml" not in choices:
choices.append("ttml")
aliases["ttml"] = codec
# Also add the enum value if different
if value_lower != "ttml" and value_lower not in choices:
choices.append(value_lower)
else:
# For other codecs, just add the enum value
if value_lower not in choices:
choices.append(value_lower)
self.aliases = aliases
super().__init__(choices, case_sensitive=False)
def convert(self, value: Any, param: Optional[click.Parameter] = None, ctx: Optional[click.Context] = None):
if not value:
return None
# First try to convert using the parent class
converted_value = super().convert(value, param, ctx)
# Check aliases first
if converted_value.lower() in self.aliases:
return self.aliases[converted_value.lower()]
# Now map the converted value back to the enum
for codec in self.codec_enum:
if converted_value.lower() == codec.name.lower():
return codec
if converted_value.lower() == codec.value.lower():
return codec
# This shouldn't happen if the parent conversion worked
self.fail(f"'{value}' is not a valid subtitle codec", param, ctx)
class ContextData:
def __init__(self, config: dict, cdm: WidevineCdm, proxy_providers: list, profile: Optional[str] = None):
self.config = config
self.cdm = cdm
self.proxy_providers = proxy_providers
self.profile = profile
class SeasonRange(click.ParamType):
name = "ep_range"
MIN_EPISODE = 0
MAX_EPISODE = 999
def parse_tokens(self, *tokens: str) -> list[str]:
"""
Parse multiple tokens or ranged tokens as '{s}x{e}' strings.
Supports exclusion by putting a `-` before the token.
Example:
>>> sr = SeasonRange()
>>> sr.parse_tokens("S01E01")
["1x1"]
>>> sr.parse_tokens("S02E01", "S02E03-S02E05")
["2x1", "2x3", "2x4", "2x5"]
>>> sr.parse_tokens("S01-S05", "-S03", "-S02E01")
["1x0", "1x1", ..., "2x0", (...), "2x2", (...), "4x0", ..., "5x0", ...]
"""
if len(tokens) == 0:
return []
computed: list = []
exclusions: list = []
for token in tokens:
exclude = token.startswith("-")
if exclude:
token = token[1:]
parsed = [
re.match(r"^S(?P<season>\d+)(E(?P<episode>\d+))?$", x, re.IGNORECASE) for x in re.split(r"[:-]", token)
]
if len(parsed) > 2:
self.fail(f"Invalid token, only a left and right range is acceptable: {token}")
if len(parsed) == 1:
parsed.append(parsed[0])
if any(x is None for x in parsed):
self.fail(f"Invalid token, syntax error occurred: {token}")
from_season, from_episode = [
int(v) if v is not None else self.MIN_EPISODE
for k, v in parsed[0].groupdict().items()
if parsed[0] # type: ignore[union-attr]
]
to_season, to_episode = [
int(v) if v is not None else self.MAX_EPISODE
for k, v in parsed[1].groupdict().items()
if parsed[1] # type: ignore[union-attr]
]
if from_season > to_season:
self.fail(f"Invalid range, left side season cannot be bigger than right side season: {token}")
if from_season == to_season and from_episode > to_episode:
self.fail(f"Invalid range, left side episode cannot be bigger than right side episode: {token}")
for s in range(from_season, to_season + 1):
for e in range(
from_episode if s == from_season else 0, (self.MAX_EPISODE if s < to_season else to_episode) + 1
):
(computed if not exclude else exclusions).append(f"{s}x{e}")
for exclusion in exclusions:
if exclusion in computed:
computed.remove(exclusion)
return list(set(computed))
def convert(
self, value: str, param: Optional[click.Parameter] = None, ctx: Optional[click.Context] = None
) -> list[str]:
return self.parse_tokens(*re.split(r"\s*[,;]\s*", value))
class LanguageRange(click.ParamType):
name = "lang_range"
def convert(
self, value: Union[str, list], param: Optional[click.Parameter] = None, ctx: Optional[click.Context] = None
) -> list[str]:
if isinstance(value, list):
return value
if not value:
return []
return re.split(r"\s*[,;]\s*", value)
class QualityList(click.ParamType):
name = "quality_list"
def convert(
self, value: Union[str, list[str]], param: Optional[click.Parameter] = None, ctx: Optional[click.Context] = None
) -> list[int]:
if not value:
return []
if not isinstance(value, list):
value = value.split(",")
resolutions = []
for resolution in value:
try:
resolutions.append(int(resolution.lower().rstrip("p")))
except (AttributeError, TypeError):  # non-string values have no .lower()
self.fail(
f"Expected string for int() conversion, got {resolution!r} of type {type(resolution).__name__}",
param,
ctx,
)
except ValueError:
self.fail(f"{resolution!r} is not a valid integer", param, ctx)
return sorted(resolutions, reverse=True)
class MultipleChoice(click.Choice):
"""
The multiple choice type allows multiple values to be checked against
a fixed set of supported values.
It internally uses and is based off of click.Choice.
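Example:
>>> MultipleChoice(["foo", "bar", "baz"]).convert("foo,baz")
['foo', 'baz']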
"""
name = "multiple_choice"
def __repr__(self) -> str:
return f"MultipleChoice({list(self.choices)})"
def convert(
self, value: Any, param: Optional[click.Parameter] = None, ctx: Optional[click.Context] = None
) -> list[Any]:
if not value:
return []
if isinstance(value, str):
values = value.split(",")
elif isinstance(value, list):
values = value
else:
self.fail(f"{value!r} is not a supported value.", param, ctx)
chosen_values: list[Any] = []
for value in values:
chosen_values.append(super().convert(value, param, ctx))
return chosen_values
def shell_complete(self, ctx: click.Context, param: click.Parameter, incomplete: str) -> list[CompletionItem]:
"""
Complete choices that start with the incomplete value.
Parameters:
ctx: Invocation context for this command.
param: The parameter that is requesting completion.
incomplete: Value being completed. May be empty.
"""
incomplete = incomplete.rsplit(",")[-1]
return super().shell_complete(ctx, param, incomplete)
SEASON_RANGE = SeasonRange()
LANGUAGE_RANGE = LanguageRange()
QUALITY_LIST = QualityList()
# VIDEO_CODEC_CHOICE will be created dynamically when imported

View File

@ -0,0 +1,51 @@
import itertools
from typing import Any, Iterable, Iterator, Sequence, Tuple, Type, Union
def as_lists(*args: Any) -> Iterator[Any]:
"""Converts any input objects to list objects."""
for item in args:
yield item if isinstance(item, list) else [item]
def as_list(*args: Any) -> list:
"""
Convert any input objects to a single merged list object.
Example:
>>> as_list('foo', ['buzz', 'bizz'], 'bazz', 'bozz', ['bar'], ['bur'])
['foo', 'buzz', 'bizz', 'bazz', 'bozz', 'bar', 'bur']
"""
return list(itertools.chain.from_iterable(as_lists(*args)))
def flatten(items: Any, ignore_types: Union[Type, Tuple[Type, ...]] = str) -> Iterator:
"""
Flattens items recursively.
Example:
>>> list(flatten(["foo", [["bar", ["buzz", [""]], "bee"]]]))
['foo', 'bar', 'buzz', '', 'bee']
>>> list(flatten("foo"))
['foo']
>>> list(flatten({1}, set))
[{1}]
"""
if isinstance(items, (Iterable, Sequence)) and not isinstance(items, ignore_types):
for i in items:
yield from flatten(i, ignore_types)
else:
yield items
def merge_dict(source: dict, destination: dict) -> None:
"""Recursively merge Source into Destination in-place."""
if not source:
return
for key, value in source.items():
if isinstance(value, dict):
# get node or create one
node = destination.setdefault(key, {})
merge_dict(value, node)
else:
destination[key] = value
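# A minimal illustration of the in-place merge (hypothetical values):
# >>> base = {"a": {"b": 1}}
# >>> merge_dict({"a": {"c": 2}, "d": 3}, base)
# >>> base
# {'a': {'b': 1, 'c': 2}, 'd': 3}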

View File

@ -0,0 +1,30 @@
import logging
import os
import random
from datetime import datetime, timedelta
log = logging.getLogger("NF-ESN")
def chrome_esn_generator():
ESN_GEN = "".join(random.choice("0123456789ABCDEF") for _ in range(30))
esn_file = ".esn"
def gen_file():
with open(esn_file, "w") as file:
file.write(f"NFCDIE-03-{ESN_GEN}")
if not os.path.isfile(esn_file):
log.warning("Generating a new Chrome ESN")
gen_file()
file_datetime = datetime.fromtimestamp(os.path.getmtime(esn_file))
time_diff = datetime.now() - file_datetime
if time_diff > timedelta(hours=6):
log.warning("Old ESN detected, Generating a new Chrome ESN")
gen_file()
with open(esn_file, "r") as f:
esn = f.read()
return esn

View File

@ -0,0 +1,24 @@
import platform
def get_os_arch(name: str) -> str:
"""Builds a name-os-arch based on the input name, system, architecture."""
os_name = platform.system().lower()
os_arch = platform.machine().lower()
# Map platform.system() output to desired OS name
if os_name == "windows":
os_name = "win"
elif os_name == "darwin":
os_name = "osx"
else:
os_name = "linux"
# Map platform.machine() output to desired architecture
if os_arch in ["x86_64", "amd64"]:
os_arch = "x64"
elif os_arch == "arm64":
os_arch = "arm64"
# Construct the dependency name in the desired format using the input name
return f"{name}-{os_name}-{os_arch}"

View File

@ -0,0 +1,77 @@
import ssl
from typing import Optional
from requests.adapters import HTTPAdapter
class SSLCiphers(HTTPAdapter):
"""
Custom HTTP Adapter to change the TLS Cipher set and security requirements.
Security Level may optionally be provided. A level above 0 must be used at all times.
A list of Security Levels and their security is listed below. Usually 2 is used by default.
Do not set the Security level via @SECLEVEL in the cipher list.
Level 0:
Everything is permitted. This retains compatibility with previous versions of OpenSSL.
Level 1:
The security level corresponds to a minimum of 80 bits of security. Any parameters
offering below 80 bits of security are excluded. As a result RSA, DSA and DH keys
shorter than 1024 bits and ECC keys shorter than 160 bits are prohibited. All export
cipher suites are prohibited since they all offer less than 80 bits of security. SSL
version 2 is prohibited. Any cipher suite using MD5 for the MAC is also prohibited.
Level 2:
Security level set to 112 bits of security. As a result RSA, DSA and DH keys shorter
than 2048 bits and ECC keys shorter than 224 bits are prohibited. In addition to the
level 1 exclusions any cipher suite using RC4 is also prohibited. SSL version 3 is
also not allowed. Compression is disabled.
Level 3:
Security level set to 128 bits of security. As a result RSA, DSA and DH keys shorter
than 3072 bits and ECC keys shorter than 256 bits are prohibited. In addition to the
level 2 exclusions cipher suites not offering forward secrecy are prohibited. TLS
versions below 1.1 are not permitted. Session tickets are disabled.
Level 4:
Security level set to 192 bits of security. As a result RSA, DSA and DH keys shorter
than 7680 bits and ECC keys shorter than 384 bits are prohibited. Cipher suites using
SHA1 for the MAC are prohibited. TLS versions below 1.2 are not permitted.
Level 5:
Security level set to 256 bits of security. As a result RSA, DSA and DH keys shorter
than 15360 bits and ECC keys shorter than 512 bits are prohibited.
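Example (a minimal sketch; available ciphers depend on the local OpenSSL build):
>>> import requests
>>> session = requests.Session()
>>> session.mount("https://", SSLCiphers(security_level=2))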
"""
def __init__(self, cipher_list: Optional[str] = None, security_level: int = 0, *args, **kwargs):
if cipher_list:
if not isinstance(cipher_list, str):
raise TypeError(f"Expected cipher_list to be a str, not {cipher_list!r}")
if "@SECLEVEL" in cipher_list:
raise ValueError("You must not specify the Security Level manually in the cipher list.")
if not isinstance(security_level, int):
raise TypeError(f"Expected security_level to be an int, not {security_level!r}")
if security_level not in range(6):
raise ValueError(f"The security_level must be a value between 0 and 5, not {security_level}")
if not cipher_list:
# cpython's default cipher list differs from the python-requests cipher list
cipher_list = "DEFAULT"
cipher_list += f":@SECLEVEL={security_level}"
ctx = ssl.create_default_context()
ctx.check_hostname = False # For some reason this is needed to avoid a verification error
ctx.set_ciphers(cipher_list)
self._ssl_context = ctx
super().__init__(*args, **kwargs)
def init_poolmanager(self, *args, **kwargs):
kwargs["ssl_context"] = self._ssl_context
return super().init_poolmanager(*args, **kwargs)
def proxy_manager_for(self, *args, **kwargs):
kwargs["ssl_context"] = self._ssl_context
return super().proxy_manager_for(*args, **kwargs)

View File

@ -0,0 +1,25 @@
import json
import subprocess
from pathlib import Path
from typing import Union
from unshackle.core import binaries
def ffprobe(uri: Union[bytes, Path]) -> dict:
"""Use ffprobe on the provided data to get stream information."""
if not binaries.FFProbe:
raise EnvironmentError('FFProbe executable "ffprobe" not found but is required.')
args = [binaries.FFProbe, "-v", "quiet", "-of", "json", "-show_streams"]
if isinstance(uri, Path):
args.extend(
["-f", "lavfi", "-i", "movie={}[out+subcc]".format(str(uri).replace("\\", "/").replace(":", "\\\\:"))]
)
elif isinstance(uri, bytes):
args.append("pipe:")
try:
ff = subprocess.run(args, input=uri if isinstance(uri, bytes) else None, check=True, capture_output=True)
except subprocess.CalledProcessError:
return {}
return json.loads(ff.stdout.decode("utf8"))
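# Example (illustrative; assumes an ffprobe binary on PATH and a local file):
# streams = ffprobe(Path("video.mp4")).get("streams", [])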

View File

@ -0,0 +1,651 @@
from __future__ import annotations
import logging
import re
import subprocess
import tempfile
from difflib import SequenceMatcher
from pathlib import Path
from typing import Optional, Tuple
from xml.sax.saxutils import escape
import requests
from requests.adapters import HTTPAdapter, Retry
from unshackle.core import binaries
from unshackle.core.config import config
from unshackle.core.titles.episode import Episode
from unshackle.core.titles.movie import Movie
from unshackle.core.titles.title import Title
STRIP_RE = re.compile(r"[^a-z0-9]+", re.I)
YEAR_RE = re.compile(r"\s*\(?[12][0-9]{3}\)?$")
HEADERS = {"User-Agent": "unshackle-tags/1.0"}
log = logging.getLogger("TAGS")
def _get_session() -> requests.Session:
"""Create a requests session with retry logic for network failures."""
session = requests.Session()
session.headers.update(HEADERS)
retry = Retry(
total=3, backoff_factor=1, status_forcelist=[429, 500, 502, 503, 504], allowed_methods=["GET", "POST"]
)
adapter = HTTPAdapter(max_retries=retry)
session.mount("https://", adapter)
session.mount("http://", adapter)
return session
def _api_key() -> Optional[str]:
return config.tmdb_api_key
def _simkl_client_id() -> Optional[str]:
return config.simkl_client_id
def _clean(s: str) -> str:
return STRIP_RE.sub("", s).lower()
def _strip_year(s: str) -> str:
return YEAR_RE.sub("", s).strip()
def fuzzy_match(a: str, b: str, threshold: float = 0.8) -> bool:
"""Return True if ``a`` and ``b`` are a close match."""
ratio = SequenceMatcher(None, _clean(a), _clean(b)).ratio()
return ratio >= threshold
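# Example (illustrative): _clean strips case and punctuation, so
# fuzzy_match("The Matrix", "the matrix!") evaluates to True.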
def search_simkl(
title: str,
year: Optional[int],
kind: str,
title_cacher=None,
cache_title_id: Optional[str] = None,
cache_region: Optional[str] = None,
cache_account_hash: Optional[str] = None,
) -> Tuple[Optional[dict], Optional[str], Optional[int]]:
"""Search Simkl API for show information by filename."""
if title_cacher and cache_title_id:
cached_simkl = title_cacher.get_cached_simkl(cache_title_id, cache_region, cache_account_hash)
if cached_simkl:
log.debug("Using cached Simkl data")
if cached_simkl.get("type") == "episode" and "show" in cached_simkl:
show_info = cached_simkl["show"]
show_title = show_info.get("title")
tmdb_id = show_info.get("ids", {}).get("tmdbtv")
if tmdb_id:
tmdb_id = int(tmdb_id)
return cached_simkl, show_title, tmdb_id
elif cached_simkl.get("type") == "movie" and "movie" in cached_simkl:
movie_info = cached_simkl["movie"]
movie_title = movie_info.get("title")
ids = movie_info.get("ids", {})
tmdb_id = ids.get("tmdb") or ids.get("moviedb")
if tmdb_id:
tmdb_id = int(tmdb_id)
return cached_simkl, movie_title, tmdb_id
log.debug("Searching Simkl for %r (%s, %s)", title, kind, year)
client_id = _simkl_client_id()
if not client_id:
log.debug("No SIMKL client ID configured; skipping SIMKL search")
return None, None, None
# Construct appropriate filename based on type
filename = f"{title}"
if year:
filename = f"{title} {year}"
if kind == "tv":
filename += " S01E01.mkv"
else: # movie
filename += " 2160p.mkv"
try:
session = _get_session()
headers = {"simkl-api-key": client_id}
resp = session.post("https://api.simkl.com/search/file", json={"file": filename}, headers=headers, timeout=30)
resp.raise_for_status()
data = resp.json()
log.debug("Simkl API response received")
# Handle case where SIMKL returns empty list (no results)
if isinstance(data, list):
log.debug("Simkl returned list (no matches) for %r", filename)
return None, None, None
# Handle TV show responses
if data.get("type") == "episode" and "show" in data:
show_info = data["show"]
show_title = show_info.get("title")
show_year = show_info.get("year")
# Verify title matches and year if provided
if not fuzzy_match(show_title, title):
log.debug("Simkl title mismatch: searched %r, got %r", title, show_title)
return None, None, None
if year and show_year and abs(year - show_year) > 1: # Allow 1 year difference
log.debug("Simkl year mismatch: searched %d, got %d", year, show_year)
return None, None, None
if title_cacher and cache_title_id:
try:
title_cacher.cache_simkl(cache_title_id, data, cache_region, cache_account_hash)
except Exception as exc:
log.debug("Failed to cache Simkl data: %s", exc)
tmdb_id = show_info.get("ids", {}).get("tmdbtv")
if tmdb_id:
tmdb_id = int(tmdb_id)
log.debug("Simkl -> %s (TMDB ID %s)", show_title, tmdb_id)
return data, show_title, tmdb_id
elif data.get("type") == "movie" and "movie" in data:
movie_info = data["movie"]
movie_title = movie_info.get("title")
movie_year = movie_info.get("year")
if not fuzzy_match(movie_title, title):
log.debug("Simkl title mismatch: searched %r, got %r", title, movie_title)
return None, None, None
if year and movie_year and abs(year - movie_year) > 1: # Allow 1 year difference
log.debug("Simkl year mismatch: searched %d, got %d", year, movie_year)
return None, None, None
if title_cacher and cache_title_id:
try:
title_cacher.cache_simkl(cache_title_id, data, cache_region, cache_account_hash)
except Exception as exc:
log.debug("Failed to cache Simkl data: %s", exc)
ids = movie_info.get("ids", {})
tmdb_id = ids.get("tmdb") or ids.get("moviedb")
if tmdb_id:
tmdb_id = int(tmdb_id)
log.debug("Simkl -> %s (TMDB ID %s)", movie_title, tmdb_id)
return data, movie_title, tmdb_id
except (requests.RequestException, ValueError, KeyError) as exc:
log.debug("Simkl search failed: %s", exc)
return None, None, None
def search_show_info(
title: str,
year: Optional[int],
kind: str,
title_cacher=None,
cache_title_id: Optional[str] = None,
cache_region: Optional[str] = None,
cache_account_hash: Optional[str] = None,
) -> Tuple[Optional[int], Optional[str], Optional[str]]:
"""Search for show information, trying Simkl first, then TMDB fallback. Returns (tmdb_id, title, source)."""
simkl_data, simkl_title, simkl_tmdb_id = search_simkl(
title, year, kind, title_cacher, cache_title_id, cache_region, cache_account_hash
)
if simkl_data and simkl_title and fuzzy_match(simkl_title, title):
return simkl_tmdb_id, simkl_title, "simkl"
tmdb_id, tmdb_title = search_tmdb(title, year, kind, title_cacher, cache_title_id, cache_region, cache_account_hash)
return tmdb_id, tmdb_title, "tmdb"
def _fetch_tmdb_detail(tmdb_id: int, kind: str) -> Optional[dict]:
"""Fetch full TMDB detail response for caching."""
api_key = _api_key()
if not api_key:
return None
try:
session = _get_session()
r = session.get(
f"https://api.themoviedb.org/3/{kind}/{tmdb_id}",
params={"api_key": api_key},
timeout=30,
)
r.raise_for_status()
return r.json()
except requests.RequestException as exc:
log.debug("Failed to fetch TMDB detail: %s", exc)
return None
def _fetch_tmdb_external_ids(tmdb_id: int, kind: str) -> Optional[dict]:
"""Fetch full TMDB external_ids response for caching."""
api_key = _api_key()
if not api_key:
return None
try:
session = _get_session()
r = session.get(
f"https://api.themoviedb.org/3/{kind}/{tmdb_id}/external_ids",
params={"api_key": api_key},
timeout=30,
)
r.raise_for_status()
return r.json()
except requests.RequestException as exc:
log.debug("Failed to fetch TMDB external IDs: %s", exc)
return None
def search_tmdb(
title: str,
year: Optional[int],
kind: str,
title_cacher=None,
cache_title_id: Optional[str] = None,
cache_region: Optional[str] = None,
cache_account_hash: Optional[str] = None,
) -> Tuple[Optional[int], Optional[str]]:
if title_cacher and cache_title_id:
cached_tmdb = title_cacher.get_cached_tmdb(cache_title_id, kind, cache_region, cache_account_hash)
if cached_tmdb and cached_tmdb.get("detail"):
detail = cached_tmdb["detail"]
tmdb_id = detail.get("id")
tmdb_title = detail.get("title") or detail.get("name")
log.debug("Using cached TMDB data: %r (ID %s)", tmdb_title, tmdb_id)
return tmdb_id, tmdb_title
api_key = _api_key()
if not api_key:
return None, None
search_title = _strip_year(title)
log.debug("Searching TMDB for %r (%s, %s)", search_title, kind, year)
params = {"api_key": api_key, "query": search_title}
if year is not None:
params["year" if kind == "movie" else "first_air_date_year"] = year
try:
session = _get_session()
r = session.get(
f"https://api.themoviedb.org/3/search/{kind}",
params=params,
timeout=30,
)
r.raise_for_status()
js = r.json()
results = js.get("results") or []
log.debug("TMDB returned %d results", len(results))
if not results:
return None, None
except requests.RequestException as exc:
log.warning("Failed to search TMDB for %s: %s", title, exc)
return None, None
best_ratio = 0.0
best_id: Optional[int] = None
best_title: Optional[str] = None
for result in results:
candidates = [
result.get("title"),
result.get("name"),
result.get("original_title"),
result.get("original_name"),
]
candidates = [c for c in candidates if c] # Filter out None/empty values
if not candidates:
continue
# Find the best matching candidate from all available titles
for candidate in candidates:
ratio = SequenceMatcher(None, _clean(search_title), _clean(candidate)).ratio()
if ratio > best_ratio:
best_ratio = ratio
best_id = result.get("id")
best_title = candidate
log.debug(
"Best candidate ratio %.2f for %r (ID %s)",
best_ratio,
best_title,
best_id,
)
if best_id is not None:
if title_cacher and cache_title_id:
try:
detail_response = _fetch_tmdb_detail(best_id, kind)
external_ids_response = _fetch_tmdb_external_ids(best_id, kind)
if detail_response and external_ids_response:
title_cacher.cache_tmdb(
cache_title_id, detail_response, external_ids_response, kind, cache_region, cache_account_hash
)
except Exception as exc:
log.debug("Failed to cache TMDB data: %s", exc)
return best_id, best_title
first = results[0]
return first.get("id"), first.get("title") or first.get("name")
def get_title(
tmdb_id: int,
kind: str,
title_cacher=None,
cache_title_id: Optional[str] = None,
cache_region: Optional[str] = None,
cache_account_hash: Optional[str] = None,
) -> Optional[str]:
"""Fetch the name/title of a TMDB entry by ID."""
if title_cacher and cache_title_id:
cached_tmdb = title_cacher.get_cached_tmdb(cache_title_id, kind, cache_region, cache_account_hash)
if cached_tmdb and cached_tmdb.get("detail"):
detail = cached_tmdb["detail"]
tmdb_title = detail.get("title") or detail.get("name")
log.debug("Using cached TMDB title: %r", tmdb_title)
return tmdb_title
api_key = _api_key()
if not api_key:
return None
try:
session = _get_session()
r = session.get(
f"https://api.themoviedb.org/3/{kind}/{tmdb_id}",
params={"api_key": api_key},
timeout=30,
)
r.raise_for_status()
js = r.json()
if title_cacher and cache_title_id:
try:
external_ids_response = _fetch_tmdb_external_ids(tmdb_id, kind)
if external_ids_response:
title_cacher.cache_tmdb(
cache_title_id, js, external_ids_response, kind, cache_region, cache_account_hash
)
except Exception as exc:
log.debug("Failed to cache TMDB data: %s", exc)
return js.get("title") or js.get("name")
except requests.RequestException as exc:
log.debug("Failed to fetch TMDB title: %s", exc)
return None
def get_year(
tmdb_id: int,
kind: str,
title_cacher=None,
cache_title_id: Optional[str] = None,
cache_region: Optional[str] = None,
cache_account_hash: Optional[str] = None,
) -> Optional[int]:
"""Fetch the release year of a TMDB entry by ID."""
if title_cacher and cache_title_id:
cached_tmdb = title_cacher.get_cached_tmdb(cache_title_id, kind, cache_region, cache_account_hash)
if cached_tmdb and cached_tmdb.get("detail"):
detail = cached_tmdb["detail"]
date = detail.get("release_date") or detail.get("first_air_date")
if date and len(date) >= 4 and date[:4].isdigit():
year = int(date[:4])
log.debug("Using cached TMDB year: %d", year)
return year
api_key = _api_key()
if not api_key:
return None
try:
session = _get_session()
r = session.get(
f"https://api.themoviedb.org/3/{kind}/{tmdb_id}",
params={"api_key": api_key},
timeout=30,
)
r.raise_for_status()
js = r.json()
if title_cacher and cache_title_id:
try:
external_ids_response = _fetch_tmdb_external_ids(tmdb_id, kind)
if external_ids_response:
title_cacher.cache_tmdb(
cache_title_id, js, external_ids_response, kind, cache_region, cache_account_hash
)
except Exception as exc:
log.debug("Failed to cache TMDB data: %s", exc)
date = js.get("release_date") or js.get("first_air_date")
if date and len(date) >= 4 and date[:4].isdigit():
return int(date[:4])
return None
except requests.RequestException as exc:
log.debug("Failed to fetch TMDB year: %s", exc)
return None
def external_ids(
tmdb_id: int,
kind: str,
title_cacher=None,
cache_title_id: Optional[str] = None,
cache_region: Optional[str] = None,
cache_account_hash: Optional[str] = None,
) -> dict:
if title_cacher and cache_title_id:
cached_tmdb = title_cacher.get_cached_tmdb(cache_title_id, kind, cache_region, cache_account_hash)
if cached_tmdb and cached_tmdb.get("external_ids"):
log.debug("Using cached TMDB external IDs")
return cached_tmdb["external_ids"]
api_key = _api_key()
if not api_key:
return {}
url = f"https://api.themoviedb.org/3/{kind}/{tmdb_id}/external_ids"
log.debug("Fetching external IDs for %s %s", kind, tmdb_id)
try:
session = _get_session()
r = session.get(
url,
params={"api_key": api_key},
timeout=30,
)
r.raise_for_status()
js = r.json()
log.debug("External IDs response: %s", js)
if title_cacher and cache_title_id:
try:
detail_response = _fetch_tmdb_detail(tmdb_id, kind)
if detail_response:
title_cacher.cache_tmdb(cache_title_id, detail_response, js, kind, cache_region, cache_account_hash)
except Exception as exc:
log.debug("Failed to cache TMDB data: %s", exc)
return js
except requests.RequestException as exc:
log.warning("Failed to fetch external IDs for %s %s: %s", kind, tmdb_id, exc)
return {}
def apply_tags(path: Path, tags: dict[str, str]) -> None:
if not tags:
return
if not binaries.Mkvpropedit:
log.debug("mkvpropedit not found on PATH; skipping tags")
return
log.debug("Applying tags to %s: %s", path, tags)
xml_lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<Tags>", " <Tag>", " <Targets/>"]
for name, value in tags.items():
xml_lines.append(f" <Simple><Name>{escape(name)}</Name><String>{escape(value)}</String></Simple>")
xml_lines.extend([" </Tag>", "</Tags>"])
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False, encoding="utf-8") as f:
f.write("\n".join(xml_lines))
tmp_path = Path(f.name)
try:
subprocess.run(
[str(binaries.Mkvpropedit), str(path), "--tags", f"global:{tmp_path}"],
check=False,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
log.debug("Tags applied via mkvpropedit")
finally:
tmp_path.unlink(missing_ok=True)
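# The generated global tags XML (illustrative, with a hypothetical IMDB tag) looks like:
# <?xml version="1.0" encoding="UTF-8"?>
# <Tags>
#   <Tag>
#     <Targets/>
#     <Simple><Name>IMDB</Name><String>tt0133093</String></Simple>
#   </Tag>
# </Tags>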
def tag_file(path: Path, title: Title, tmdb_id: Optional[int] = None) -> None:
log.debug("Tagging file %s with title %r", path, title)
standard_tags: dict[str, str] = {}
custom_tags: dict[str, str] = {}
if config.tag and config.tag_group_name:
custom_tags["Group"] = config.tag
description = getattr(title, "description", None)
if description:
if len(description) > 255:
truncated = description[:255]
if " " in truncated:
truncated = truncated.rsplit(" ", 1)[0]
description = truncated + "..."
custom_tags["Description"] = description
if isinstance(title, Movie):
kind = "movie"
name = title.name
year = title.year
elif isinstance(title, Episode):
kind = "tv"
name = title.title
year = title.year
else:
apply_tags(path, custom_tags)
return
if config.tag_imdb_tmdb:
# Check if we have any API keys available for metadata lookup
api_key = _api_key()
simkl_client = _simkl_client_id()
if not api_key and not simkl_client:
log.debug("No TMDB API key or Simkl client ID configured; skipping IMDB/TMDB tag lookup")
apply_tags(path, custom_tags)
return
else:
# If tmdb_id is provided (via --tmdb), skip Simkl and use TMDB directly
if tmdb_id is not None:
log.debug("Using provided TMDB ID %s for tags", tmdb_id)
else:
# Try Simkl first for automatic lookup (only if client ID is available)
if simkl_client:
simkl_data, simkl_title, simkl_tmdb_id = search_simkl(name, year, kind)
if simkl_data and simkl_title and fuzzy_match(simkl_title, name):
log.debug("Using Simkl data for tags")
if simkl_tmdb_id:
tmdb_id = simkl_tmdb_id
# Handle TV show data from Simkl
if simkl_data.get("type") == "episode" and "show" in simkl_data:
show_ids = simkl_data.get("show", {}).get("ids", {})
if show_ids.get("imdb"):
standard_tags["IMDB"] = show_ids["imdb"]
if show_ids.get("tvdb"):
standard_tags["TVDB2"] = f"series/{show_ids['tvdb']}"
if show_ids.get("tmdbtv"):
standard_tags["TMDB"] = f"tv/{show_ids['tmdbtv']}"
# Handle movie data from Simkl
elif simkl_data.get("type") == "movie" and "movie" in simkl_data:
movie_ids = simkl_data.get("movie", {}).get("ids", {})
if movie_ids.get("imdb"):
standard_tags["IMDB"] = movie_ids["imdb"]
if movie_ids.get("tvdb"):
standard_tags["TVDB2"] = f"movies/{movie_ids['tvdb']}"
if movie_ids.get("tmdb"):
standard_tags["TMDB"] = f"movie/{movie_ids['tmdb']}"
# Use TMDB API for additional metadata (either from provided ID or Simkl lookup)
if api_key:
tmdb_title: Optional[str] = None
if tmdb_id is None:
tmdb_id, tmdb_title = search_tmdb(name, year, kind)
log.debug("TMDB search result: %r (ID %s)", tmdb_title, tmdb_id)
if not tmdb_id or not tmdb_title or not fuzzy_match(tmdb_title, name):
log.debug("TMDB search did not match; skipping external ID lookup")
else:
prefix = "movie" if kind == "movie" else "tv"
standard_tags["TMDB"] = f"{prefix}/{tmdb_id}"
try:
ids = external_ids(tmdb_id, kind)
except requests.RequestException as exc:
log.debug("Failed to fetch external IDs: %s", exc)
ids = {}
else:
log.debug("External IDs found: %s", ids)
imdb_id = ids.get("imdb_id")
if imdb_id:
standard_tags["IMDB"] = imdb_id
tvdb_id = ids.get("tvdb_id")
if tvdb_id:
if kind == "movie":
standard_tags["TVDB2"] = f"movies/{tvdb_id}"
else:
standard_tags["TVDB2"] = f"series/{tvdb_id}"
elif tmdb_id is not None:
# tmdb_id was provided or found via Simkl
prefix = "movie" if kind == "movie" else "tv"
standard_tags["TMDB"] = f"{prefix}/{tmdb_id}"
try:
ids = external_ids(tmdb_id, kind)
except requests.RequestException as exc:
log.debug("Failed to fetch external IDs: %s", exc)
ids = {}
else:
log.debug("External IDs found: %s", ids)
imdb_id = ids.get("imdb_id")
if imdb_id:
standard_tags["IMDB"] = imdb_id
tvdb_id = ids.get("tvdb_id")
if tvdb_id:
if kind == "movie":
standard_tags["TVDB2"] = f"movies/{tvdb_id}"
else:
standard_tags["TVDB2"] = f"series/{tvdb_id}"
else:
log.debug("No TMDB API key configured; skipping TMDB external ID lookup")
merged_tags = {
**custom_tags,
**standard_tags,
}
apply_tags(path, merged_tags)
__all__ = [
"search_simkl",
"search_show_info",
"search_tmdb",
"get_title",
"get_year",
"external_ids",
"tag_file",
"fuzzy_match",
]

View File

@ -0,0 +1,212 @@
import re
import sys
import typing
from typing import Optional
import pysubs2
from pycaption import Caption, CaptionList, CaptionNode, CaptionReadError, WebVTTReader, WebVTTWriter
from unshackle.core.config import config
class CaptionListExt(CaptionList):
@typing.no_type_check
def __init__(self, iterable=None, layout_info=None):
self.first_segment_mpegts = 0
super().__init__(iterable, layout_info)
class CaptionExt(Caption):
@typing.no_type_check
def __init__(self, start, end, nodes, style=None, layout_info=None, segment_index=0, mpegts=0, cue_time=0.0):
style = style or {}
self.segment_index: int = segment_index
self.mpegts: float = mpegts
self.cue_time: float = cue_time
super().__init__(start, end, nodes, style, layout_info)
class WebVTTReaderExt(WebVTTReader):
# HLS extension support <https://datatracker.ietf.org/doc/html/rfc8216#section-3.5>
RE_TIMESTAMP_MAP = re.compile(r"X-TIMESTAMP-MAP.*")
RE_MPEGTS = re.compile(r"MPEGTS:(\d+)")
RE_LOCAL = re.compile(r"LOCAL:((?:(\d{1,}):)?(\d{2}):(\d{2})\.(\d{3}))")
def _parse(self, lines: list[str]) -> CaptionList:
captions = CaptionListExt()
start = None
end = None
nodes: list[CaptionNode] = []
layout_info = None
found_timing = False
segment_index = -1
mpegts = 0
cue_time = 0.0
# The first segment MPEGTS is needed to calculate the rest. It is possible that
# the first segment contains no cue and is ignored by pycaption; this acts as a fallback.
captions.first_segment_mpegts = 0
for i, line in enumerate(lines):
if "-->" in line:
found_timing = True
timing_line = i
last_start_time = captions[-1].start if captions else 0
try:
start, end, layout_info = self._parse_timing_line(line, last_start_time)
except CaptionReadError as e:
new_msg = f"{e.args[0]} (line {timing_line})"
tb = sys.exc_info()[2]
raise type(e)(new_msg).with_traceback(tb) from None
elif "" == line:
if found_timing and nodes:
found_timing = False
caption = CaptionExt(
start,
end,
nodes,
layout_info=layout_info,
segment_index=segment_index,
mpegts=mpegts,
cue_time=cue_time,
)
captions.append(caption)
nodes = []
elif "WEBVTT" in line:
# Merged segmented VTT doesn't have index information, track manually.
segment_index += 1
mpegts = 0
cue_time = 0.0
elif m := self.RE_TIMESTAMP_MAP.match(line):
if r := self.RE_MPEGTS.search(m.group()):
mpegts = int(r.group(1))
cue_time = self._parse_local(m.group())
# Early assignment in case the first segment contains no cue.
if segment_index == 0:
captions.first_segment_mpegts = mpegts
else:
if found_timing:
if nodes:
nodes.append(CaptionNode.create_break())
nodes.append(CaptionNode.create_text(self._decode(line)))
else:
# it's a comment or some metadata; ignore it
pass
# Add a last caption if there are remaining nodes
if nodes:
caption = CaptionExt(start, end, nodes, layout_info=layout_info, segment_index=segment_index, mpegts=mpegts)
captions.append(caption)
return captions
@staticmethod
def _parse_local(string: str) -> float:
"""
Parse WebVTT LOCAL time and convert it to seconds.
"""
m = WebVTTReaderExt.RE_LOCAL.search(string)
if not m:
return 0
parsed = m.groups()
if not parsed:
return 0
hours = int(parsed[1]) if parsed[1] else 0  # the hours component is optional and may be None
minutes = int(parsed[2])
seconds = int(parsed[3])
milliseconds = int(parsed[4])
return (milliseconds / 1000) + seconds + (minutes * 60) + (hours * 3600)
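# Example (illustrative): "LOCAL:00:00:05.500" parses to 5.5, and the hours
# component may be omitted, e.g. "LOCAL:01:30.000" -> 90.0.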
def merge_segmented_webvtt(vtt_raw: str, segment_durations: Optional[list[int]] = None, timescale: int = 1) -> str:
"""
Merge Segmented WebVTT data.
Parameters:
vtt_raw: The concatenated WebVTT files to merge. All WebVTT headers must be
appropriately spaced apart, or it may produce unwanted effects like
considering headers as captions, timestamp lines, etc.
segment_durations: A list of each segment's duration. If not provided it will try
to get it from the X-TIMESTAMP-MAP headers, specifically the MPEGTS number.
timescale: The number of time units per second.
This parses the X-TIMESTAMP-MAP data to compute new absolute timestamps, replacing
the old start and end timestamp values. All X-TIMESTAMP-MAP header information will
be removed from the output as they are no longer of concern. Consider this function
the opposite of a WebVTT Segmenter, a WebVTT Joiner of sorts.
Algorithm borrowed from N_m3u8DL-RE and shaka-player.
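Example (a sketch; assumes vtt_raw is several WebVTT segments concatenated in order):
>>> merged = merge_segmented_webvtt(vtt_raw, segment_durations=[5000, 5000], timescale=1000)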
"""
MPEG_TIMESCALE = 90_000
# Check config for conversion method preference
conversion_method = config.subtitle.get("conversion_method", "auto")
use_pysubs2 = conversion_method in ("pysubs2", "auto")
if use_pysubs2:
# Try using pysubs2 first for more lenient parsing
try:
# Use pysubs2 to parse and normalize the VTT
subs = pysubs2.SSAFile.from_string(vtt_raw)
# Convert back to WebVTT string for pycaption processing
normalized_vtt = subs.to_string("vtt")
vtt = WebVTTReaderExt().read(normalized_vtt)
except Exception:
# Fall back to direct pycaption parsing
vtt = WebVTTReaderExt().read(vtt_raw)
else:
# Use pycaption directly
vtt = WebVTTReaderExt().read(vtt_raw)
for lang in vtt.get_languages():
prev_caption = None
duplicate_index: list[int] = []
captions = vtt.get_captions(lang)
if captions[0].segment_index == 0:
first_segment_mpegts = captions[0].mpegts
else:
first_segment_mpegts = segment_durations[0] if segment_durations else captions.first_segment_mpegts
caption: CaptionExt
for i, caption in enumerate(captions):
# DASH WebVTT doesn't have MPEGTS timestamp like HLS. Instead,
# calculate the timestamp from SegmentTemplate/SegmentList duration.
likely_dash = first_segment_mpegts == 0 and caption.mpegts == 0
if likely_dash and segment_durations:
duration = segment_durations[caption.segment_index]
caption.mpegts = MPEG_TIMESCALE * (duration / timescale)
if caption.mpegts == 0:
continue
# Commented out to fix DSNP subs being out of sync and mistimed.
# seconds = (caption.mpegts - first_segment_mpegts) / MPEG_TIMESCALE - caption.cue_time
# offset = seconds * 1_000_000 # pycaption use microseconds
# if caption.start < offset:
# caption.start += offset
# caption.end += offset
# If the difference between current and previous captions is <=1ms
# and the payload is equal then splice.
if (
prev_caption
and not caption.is_empty()
and (caption.start - prev_caption.end) <= 1000 # 1ms in microseconds
and caption.get_text() == prev_caption.get_text()
):
prev_caption.end = caption.end
duplicate_index.append(i)
prev_caption = caption
# Remove duplicate
captions[:] = [c for c_index, c in enumerate(captions) if c_index not in set(duplicate_index)]
return WebVTTWriter().write(vtt)

View File

@ -0,0 +1,24 @@
from typing import Union
from lxml import etree
from lxml.etree import ElementTree
def load_xml(xml: Union[str, bytes]) -> ElementTree:
"""Safely parse XML data to an ElementTree, without namespaces in tags."""
if not isinstance(xml, bytes):
xml = xml.encode("utf8")
root = etree.fromstring(xml)
for elem in root.iter():  # getiterator() is deprecated in lxml
if not hasattr(elem.tag, "find"):
# e.g. comment elements
continue
elem.tag = etree.QName(elem).localname
for name, value in elem.attrib.items():
local_name = etree.QName(name).localname
if local_name == name:
continue
del elem.attrib[name]
elem.attrib[local_name] = value
etree.cleanup_namespaces(root)
return root
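# Example (illustrative): namespaced tags are reduced to their local names,
# so load_xml('<a xmlns="urn:test"><b/></a>').find("b") returns the <b> element.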

49
unshackle/core/vault.py Normal file
View File

@ -0,0 +1,49 @@
from abc import ABCMeta, abstractmethod
from typing import Iterator, Optional, Union
from uuid import UUID
class Vault(metaclass=ABCMeta):
def __init__(self, name: str, no_push: bool = False):
self.name = name
self.no_push = no_push
def __str__(self) -> str:
return f"{self.name} {type(self).__name__}"
@abstractmethod
def get_key(self, kid: Union[UUID, str], service: str) -> Optional[str]:
"""
Get Key from Vault by KID (Key ID) and Service.
It does not get the Key by PSSH, as the PSSH can differ depending on its implementation
or even how it was crafted. Some PSSH values may also actually be a CENC Header rather
than a PSSH MP4 Box, which makes the value even less consistent.
The KID, however, never changes unless the video file itself has changed, in which case
the key for the presumed-matching KID wouldn't work anyway, further proving that
matching by KID is superior.
"""
@abstractmethod
def get_keys(self, service: str) -> Iterator[tuple[str, str]]:
"""Get All Keys from Vault by Service."""
@abstractmethod
def add_key(self, service: str, kid: Union[UUID, str], key: str) -> bool:
"""Add KID:KEY to the Vault."""
@abstractmethod
def add_keys(self, service: str, kid_keys: dict[Union[UUID, str], str]) -> int:
"""
Add Multiple Content Keys with Key IDs for Service to the Vault.
Pre-existing Content Keys are ignored/skipped.
Raises PermissionError if the user has no permission to create the table.
"""
@abstractmethod
def get_services(self) -> Iterator[str]:
"""Get a list of Service Tags from Vault."""
__all__ = ("Vault",)

85
unshackle/core/vaults.py Normal file
View File

@ -0,0 +1,85 @@
from typing import Any, Iterator, Optional, Union
from uuid import UUID
from unshackle.core.config import config
from unshackle.core.utilities import import_module_by_path
from unshackle.core.vault import Vault
_VAULTS = sorted(
(path for path in config.directories.vaults.glob("*.py") if path.stem.lower() != "__init__"), key=lambda x: x.stem
)
_MODULES = {path.stem: getattr(import_module_by_path(path), path.stem) for path in _VAULTS}
class Vaults:
"""Keeps hold of Key Vaults with convenience functions, e.g. searching all vaults."""
def __init__(self, service: Optional[str] = None):
self.service = service or ""
self.vaults = []
def __iter__(self) -> Iterator[Vault]:
return iter(self.vaults)
def __len__(self) -> int:
return len(self.vaults)
def load(self, type_: str, **kwargs: Any) -> bool:
"""Load a Vault into the vaults list. Returns True if successful, False otherwise."""
module = _MODULES.get(type_)
if not module:
raise ValueError(f"Unable to find vault command by the name '{type_}'.")
try:
vault = module(**kwargs)
self.vaults.append(vault)
return True
except Exception:
return False
def load_critical(self, type_: str, **kwargs: Any) -> None:
"""Load a critical Vault that must succeed or raise an exception."""
module = _MODULES.get(type_)
if not module:
raise ValueError(f"Unable to find vault command by the name '{type_}'.")
vault = module(**kwargs)
self.vaults.append(vault)
def get_key(self, kid: Union[UUID, str]) -> tuple[Optional[str], Optional[Vault]]:
"""Get Key from the first Vault it can by KID (Key ID) and Service."""
for vault in self.vaults:
key = vault.get_key(kid, self.service)
if key and key.count("0") != len(key):
return key, vault
return None, None
def add_key(self, kid: Union[UUID, str], key: str, excluding: Optional[Vault] = None) -> int:
"""Add a KID:KEY to all Vaults, optionally with an exclusion."""
success = 0
for vault in self.vaults:
if vault != excluding and not vault.no_push:
try:
success += vault.add_key(self.service, kid, key)
except (PermissionError, NotImplementedError):
pass
return success
def add_keys(self, kid_keys: dict[Union[UUID, str], str]) -> int:
"""
Add multiple KID:KEYs to all Vaults. Duplicate Content Keys are skipped.
PermissionErrors when the user cannot create Tables are absorbed and ignored.
Vaults with no_push=True are skipped.
"""
success = 0
for vault in self.vaults:
if not vault.no_push:
try:
# Count each vault that successfully processes the keys (whether new or existing)
vault.add_keys(self.service, kid_keys)
success += 1
except (PermissionError, NotImplementedError):
pass
return success
__all__ = ("Vaults",)

View File

@ -0,0 +1,327 @@
import base64
import hashlib
import json
import re
from collections.abc import Generator
from datetime import datetime
from http.cookiejar import CookieJar
from typing import Optional, Union
import click
from langcodes import Language
from unshackle.core.constants import AnyTrack
from unshackle.core.credential import Credential
from unshackle.core.manifests import DASH
from unshackle.core.search_result import SearchResult
from unshackle.core.service import Service
from unshackle.core.titles import Episode, Movie, Movies, Series, Title_T, Titles_T
from unshackle.core.tracks import Chapter, Subtitle, Tracks, Video
class EXAMPLE(Service):
"""
Service code for domain.com
Version: 1.0.0
Authorization: Cookies
Security: FHD@L3
Use full URL (for example - https://domain.com/details/20914) or title ID (for example - 20914).
"""
TITLE_RE = r"^(?:https?://?domain\.com/details/)?(?P<title_id>[^/]+)"
GEOFENCE = ("US", "UK")
NO_SUBTITLES = True
@staticmethod
@click.command(name="EXAMPLE", short_help="https://domain.com")
@click.argument("title", type=str)
@click.option("-m", "--movie", is_flag=True, default=False, help="Specify if it's a movie")
@click.option("-d", "--device", type=str, default="android_tv", help="Select device from the config file")
@click.pass_context
def cli(ctx, **kwargs):
return EXAMPLE(ctx, **kwargs)
def __init__(self, ctx, title, movie, device):
super().__init__(ctx)
self.title = title
self.movie = movie
self.device = device
self.cdm = ctx.obj.cdm
# Get range parameter for HDR support
range_param = ctx.parent.params.get("range_")
self.range = range_param[0].name if range_param else "SDR"
if self.config is None:
raise Exception("Config is missing!")
else:
profile_name = ctx.parent.params.get("profile")
if profile_name is None:
profile_name = "default"
self.profile = profile_name
def authenticate(self, cookies: Optional[CookieJar] = None, credential: Optional[Credential] = None) -> None:
super().authenticate(cookies, credential)
if not cookies:
raise EnvironmentError("Service requires Cookies for Authentication.")
jwt_token = next((cookie.value for cookie in cookies if cookie.name == "streamco_token"), None)
if not jwt_token:
raise EnvironmentError("Required authentication cookie 'streamco_token' was not found.")
payload = json.loads(base64.urlsafe_b64decode(jwt_token.split(".")[1] + "==").decode("utf-8"))
profile_id = payload.get("profileId", None)
self.session.headers.update({"user-agent": self.config["client"][self.device]["user_agent"]})
cache = self.cache.get(f"tokens_{self.device}_{self.profile}")
if cache:
if cache.data["expires_in"] > int(datetime.now().timestamp()):
self.log.info("Using cached tokens")
else:
self.log.info("Refreshing tokens")
refresh = self.session.post(
url=self.config["endpoints"]["refresh"], data={"refresh_token": cache.data["refresh_data"]}
).json()
cache.set(data=refresh)
else:
self.log.info("Retrieving new tokens")
token = self.session.post(
url=self.config["endpoints"]["login"],
data={
"token": jwt_token,
"profileId": profile_id,
},
).json()
cache.set(data=token)
self.token = cache.data["token"]
self.user_id = cache.data["userId"]
def search(self) -> Generator[SearchResult, None, None]:
search = self.session.get(
url=self.config["endpoints"]["search"], params={"q": self.title, "token": self.token}
).json()
for result in search["entries"]:
yield SearchResult(
id_=result["id"],
title=result["title"],
label="SERIES" if result["programType"] == "series" else "MOVIE",
url=result["url"],
)
def get_titles(self) -> Titles_T:
        title_match = re.match(self.TITLE_RE, self.title)
        if not title_match:
            raise ValueError(f"Could not parse a title ID from {self.title!r}")
        self.title = title_match.group("title_id")
metadata = self.session.get(
url=self.config["endpoints"]["metadata"].format(title_id=self.title), params={"token": self.token}
).json()
if metadata["programType"] == "movie":
self.movie = True
if self.movie:
return Movies(
[
Movie(
id_=metadata["id"],
service=self.__class__,
name=metadata["title"],
description=metadata["description"],
year=metadata["releaseYear"] if metadata["releaseYear"] > 0 else None,
language=Language.find(metadata["languages"][0]),
data=metadata,
)
]
)
else:
episodes = []
for season in metadata["seasons"]:
if "Trailers" not in season["title"]:
season_data = self.session.get(url=season["url"], params={"token": self.token}).json()
for episode in season_data["entries"]:
episodes.append(
Episode(
id_=episode["id"],
service=self.__class__,
title=metadata["title"],
season=episode["season"],
number=episode["episode"],
name=episode["title"],
description=episode["description"],
year=metadata["releaseYear"] if metadata["releaseYear"] > 0 else None,
language=Language.find(metadata["languages"][0]),
data=episode,
)
)
return Series(episodes)
def get_tracks(self, title: Title_T) -> Tracks:
# Handle HYBRID mode by fetching both HDR10 and DV tracks separately
if self.range == "HYBRID" and self.cdm.security_level != 3:
tracks = Tracks()
# Get HDR10 tracks
hdr10_tracks = self._get_tracks_for_range(title, "HDR10")
tracks.add(hdr10_tracks, warn_only=True)
# Get DV tracks
dv_tracks = self._get_tracks_for_range(title, "DV")
tracks.add(dv_tracks, warn_only=True)
return tracks
else:
# Normal single-range behavior
return self._get_tracks_for_range(title, self.range)
    def _get_tracks_for_range(self, title: Title_T, range_override: Optional[str] = None) -> Tracks:
# Use range_override if provided, otherwise use self.range
current_range = range_override if range_override else self.range
# Build API request parameters
params = {
"token": self.token,
"guid": title.id,
}
data = {
"type": self.config["client"][self.device]["type"],
}
# Add range-specific parameters
if current_range == "HDR10":
data["video_format"] = "hdr10"
elif current_range == "DV":
data["video_format"] = "dolby_vision"
else:
data["video_format"] = "sdr"
        # HDR content needs an L1 CDM; with an L3 CDM, skip HDR/DV requests entirely
        if current_range in ("HDR10", "DV") and self.cdm.security_level == 3:
            return Tracks()
streams = self.session.post(
url=self.config["endpoints"]["streams"],
params=params,
data=data,
).json()["media"]
self.license = {
"url": streams["drm"]["url"],
"data": streams["drm"]["data"],
"session": streams["drm"]["session"],
}
manifest_url = streams["url"].split("?")[0]
self.log.debug(f"Manifest URL: {manifest_url}")
tracks = DASH.from_url(url=manifest_url, session=self.session).to_tracks(language=title.language)
# Set range attributes on video tracks
for video in tracks.videos:
if current_range == "HDR10":
video.range = Video.Range.HDR10
elif current_range == "DV":
video.range = Video.Range.DV
else:
video.range = Video.Range.SDR
# Remove DRM-free ("clear") audio tracks
        tracks.audio = [
            track for track in tracks.audio if "clear" not in track.data["dash"]["representation"].get("id", "")
        ]
for track in tracks.audio:
if track.channels == 6.0:
track.channels = 5.1
track_label = track.data["dash"]["adaptation_set"].get("label")
if track_label and "Audio Description" in track_label:
track.descriptive = True
tracks.subtitles.clear()
if streams.get("captions"):
for subtitle in streams["captions"]:
tracks.add(
Subtitle(
id_=hashlib.md5(subtitle["url"].encode()).hexdigest()[0:6],
url=subtitle["url"],
codec=Subtitle.Codec.from_mime("vtt"),
language=Language.get(subtitle["language"]),
# cc=True if '(cc)' in subtitle['name'] else False,
sdh=True,
)
)
if not self.movie:
title.data["chapters"] = self.session.get(
url=self.config["endpoints"]["metadata"].format(title_id=title.id), params={"token": self.token}
).json()["chapters"]
return tracks
def get_chapters(self, title: Title_T) -> list[Chapter]:
chapters = []
if title.data.get("chapters", []):
for chapter in title.data["chapters"]:
if chapter["name"] == "Intro":
chapters.append(Chapter(timestamp=chapter["start"], name="Opening"))
chapters.append(Chapter(timestamp=chapter["end"]))
if chapter["name"] == "Credits":
chapters.append(Chapter(timestamp=chapter["start"], name="Credits"))
return chapters
    def get_widevine_service_certificate(self, **_: Any) -> Optional[str]:
        """Return the Widevine service certificate from config, if available."""
        return self.config.get("certificate")
def get_playready_license(self, *, challenge: bytes, title: Title_T, track: AnyTrack) -> Optional[bytes]:
"""Retrieve a PlayReady license for a given track."""
license_url = self.config["endpoints"].get("playready_license")
if not license_url:
raise ValueError("PlayReady license endpoint not configured")
response = self.session.post(
url=license_url,
data=challenge,
headers={
"user-agent": self.config["client"][self.device]["license_user_agent"],
},
)
response.raise_for_status()
return response.content
def get_widevine_license(self, *, challenge: bytes, title: Title_T, track: AnyTrack) -> Optional[Union[bytes, str]]:
license_url = self.license.get("url") or self.config["endpoints"].get("widevine_license")
if not license_url:
raise ValueError("Widevine license endpoint not configured")
response = self.session.post(
url=license_url,
data=challenge,
params={
"session": self.license.get("session"),
"userId": self.user_id,
},
headers={
"dt-custom-data": self.license.get("data"),
"user-agent": self.config["client"][self.device]["license_user_agent"],
},
)
response.raise_for_status()
try:
return response.json().get("license")
except ValueError:
return response.content
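
One detail of the authenticate method above worth isolating: the profile ID comes from the JWT cookie's payload segment, which is plain base64url-encoded JSON. A standalone, hedged sketch (the token below is fabricated for illustration):

import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode a JWT's payload segment without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # Appending "==" covers any missing base64 padding; excess padding is ignored.
    return json.loads(base64.urlsafe_b64decode(payload_b64 + "==").decode("utf-8"))

payload = base64.urlsafe_b64encode(b'{"profileId": "abc123"}').decode().rstrip("=")
print(decode_jwt_payload(f"header.{payload}.signature")["profileId"])  # abc123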

View File

@ -0,0 +1,12 @@
endpoints:
login: https://api.domain.com/v1/login
metadata: https://api.domain.com/v1/metadata/{title_id}.json
streams: https://api.domain.com/v1/streams
playready_license: https://api.domain.com/v1/license/playready
widevine_license: https://api.domain.com/v1/license/widevine
client:
android_tv:
user_agent: USER_AGENT
license_user_agent: LICENSE_USER_AGENT
type: DATA
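
For orientation, a hedged sketch of how the EXAMPLE service above consumes this file; `config` stands for the parsed YAML mapping the core exposes as self.config (the actual loading mechanics are assumed):

import yaml

with open("EXAMPLE.yaml", encoding="utf-8") as fh:
    config = yaml.safe_load(fh)

device = "android_tv"  # the --device default from the service CLI
login_url = config["endpoints"]["login"]
metadata_url = config["endpoints"]["metadata"].format(title_id="20914")
user_agent = config["client"][device]["user_agent"]
print(login_url, metadata_url, user_agent)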

View File

@ -0,0 +1,504 @@
# API key for The Movie Database (TMDB)
tmdb_api_key: ""
# Client ID for SIMKL API (optional, improves metadata matching)
# Get your free client ID at: https://simkl.com/settings/developer/
simkl_client_id: ""
# Group or Username to postfix to the end of all download filenames following a dash
tag: user_tag
# Enable/disable tagging with group name (default: true)
tag_group_name: true
# Enable/disable tagging with IMDB/TMDB/TVDB details (default: true)
tag_imdb_tmdb: true
# Set terminal background color (custom option not in CONFIG.md)
set_terminal_bg: false
# Set file naming convention
# true for style - Prime.Suspect.S07E01.The.Final.Act.Part.One.1080p.ITV.WEB-DL.AAC2.0.H.264
# false for style - Prime Suspect S07E01 The Final Act - Part One
scene_naming: true
# Whether to include the year in series names for episodes and folders (default: true)
# true for style - Show Name (2023) S01E01 Episode Name
# false for style - Show Name S01E01 Episode Name
series_year: true
# Check for updates from GitHub repository on startup (default: true)
update_checks: true
# How often to check for updates, in hours (default: 24)
update_check_interval: 24
# Title caching configuration
# Cache title metadata to reduce redundant API calls
title_cache_enabled: true # Enable/disable title caching globally (default: true)
title_cache_time: 1800 # Cache duration in seconds (default: 1800 = 30 minutes)
title_cache_max_retention: 86400 # Maximum cache retention for fallback when API fails (default: 86400 = 24 hours)
# Debug logging configuration
# Comprehensive JSON-based debug logging for troubleshooting and service development
debug: false # Enable structured JSON debug logging (default: false)
# When enabled with --debug flag or set to true:
# - Creates JSON Lines (.jsonl) log files with complete debugging context
# - Logs: session info, CLI params, service config, CDM details, authentication,
# titles, tracks metadata, DRM operations, vault queries, errors with stack traces
# - File location: logs/unshackle_debug_{service}_{timestamp}.jsonl
# - Also creates text log: logs/unshackle_root_{timestamp}.log
debug_keys: false # Log decryption keys in debug logs (default: false)
# Set to true to include actual decryption keys in logs
# Useful for debugging key retrieval and decryption issues
# SECURITY NOTE: Passwords, tokens, cookies, and session tokens
# are ALWAYS redacted regardless of this setting
# Only affects: content_key, key fields (the actual CEKs)
# Never affects: kid, keys_count, key_id (metadata is always logged)
# Muxing configuration
muxing:
set_title: false
# Login credentials for each Service
credentials:
# Direct credentials (no profile support)
EXAMPLE: email@example.com:password
# Per-profile credentials with default fallback
SERVICE_NAME:
default: default@email.com:password # Used when no -p/--profile is specified
profile1: user1@email.com:password1
profile2: user2@email.com:password2
# Per-profile credentials without default (requires -p/--profile)
SERVICE_NAME2:
john: john@example.com:johnspassword
jane: jane@example.com:janespassword
# You can also use list format for passwords with special characters
SERVICE_NAME3:
default: ["user@email.com", ":PasswordWith:Colons"]
# Override default directories used across unshackle
directories:
cache: Cache
cookies: Cookies
dcsl: DCSL # Device Certificate Status List
downloads: Downloads
logs: Logs
temp: Temp
wvds: WVDs
prds: PRDs
# Additional directories that can be configured:
# commands: Commands
services:
- /path/to/services
- /other/path/to/services
# vaults: Vaults
# fonts: Fonts
# Pre-define which Widevine or PlayReady device to use for each Service
cdm:
# Global default CDM device (fallback for all services/profiles)
default: WVD_1
# Direct service-specific CDM
DIFFERENT_EXAMPLE: PRD_1
# Per-profile CDM configuration
EXAMPLE:
john_sd: chromecdm_903_l3 # Profile 'john_sd' uses Chrome CDM L3
jane_uhd: nexus_5_l1 # Profile 'jane_uhd' uses Nexus 5 L1
default: generic_android_l3 # Default CDM for this service
# NEW: Quality-based CDM selection
# Use different CDMs based on video resolution
# Supports operators: >=, >, <=, <, or exact match
EXAMPLE_QUALITY:
"<=1080": generic_android_l3 # Use L3 for 1080p and below
">1080": nexus_5_l1 # Use L1 for above 1080p (1440p, 2160p)
default: generic_android_l3 # Optional: fallback if no quality match
# You can mix profiles and quality thresholds in the same service
NETFLIX:
# Profile-based selection (existing functionality)
john: netflix_l3_profile
jane: netflix_l1_profile
# Quality-based selection (new functionality)
"<=720": netflix_mobile_l3
"1080": netflix_standard_l3
">=1440": netflix_premium_l1
# Fallback
default: netflix_standard_l3
# Use pywidevine Serve-compliant Remote CDMs
# Example: Custom CDM API Configuration
# This demonstrates the highly configurable custom_api type that can adapt to any CDM API format
# - name: "chrome"
# type: "custom_api"
# host: "http://remotecdm.test/"
# timeout: 30
# device:
# name: "ChromeCDM"
# type: "CHROME"
# system_id: 34312
# security_level: 3
# auth:
# type: "header"
# header_name: "x-api-key"
# key: "YOUR_API_KEY_HERE"
# custom_headers:
# User-Agent: "Unshackle/2.0.0"
# endpoints:
# get_request:
# path: "/get-challenge"
# method: "POST"
# timeout: 30
# decrypt_response:
# path: "/get-keys"
# method: "POST"
# timeout: 30
# request_mapping:
# get_request:
# param_names:
# scheme: "device"
# init_data: "init_data"
# static_params:
# scheme: "Widevine"
# decrypt_response:
# param_names:
# scheme: "device"
# license_request: "license_request"
# license_response: "license_response"
# static_params:
# scheme: "Widevine"
# response_mapping:
# get_request:
# fields:
# challenge: "challenge"
# session_id: "session_id"
# message: "message"
# message_type: "message_type"
# response_types:
# - condition: "message_type == 'license-request'"
# type: "license_request"
# success_conditions:
# - "message == 'success'"
# decrypt_response:
# fields:
# keys: "keys"
# message: "message"
# key_fields:
# kid: "kid"
# key: "key"
# type: "type"
# success_conditions:
# - "message == 'success'"
# caching:
# enabled: true
# use_vaults: true
# check_cached_first: true
remote_cdm:
- name: "chrome"
device_name: chrome
device_type: CHROME
system_id: 27175
security_level: 3
host: https://domain.com/api
secret: secret_key
- name: "chrome-2"
device_name: chrome
device_type: CHROME
system_id: 26830
security_level: 3
host: https://domain-2.com/api
secret: secret_key
- name: "decrypt_labs_chrome"
type: "decrypt_labs" # Required to identify as DecryptLabs CDM
device_name: "ChromeCDM" # Scheme identifier - must match exactly
device_type: CHROME
system_id: 4464 # Doesn't matter
security_level: 3
host: "https://keyxtractor.decryptlabs.com"
secret: "your_decrypt_labs_api_key_here" # Replace with your API key
- name: "decrypt_labs_l1"
type: "decrypt_labs"
device_name: "L1" # Scheme identifier - must match exactly
device_type: ANDROID
system_id: 4464
security_level: 1
host: "https://keyxtractor.decryptlabs.com"
secret: "your_decrypt_labs_api_key_here"
- name: "decrypt_labs_l2"
type: "decrypt_labs"
device_name: "L2" # Scheme identifier - must match exactly
device_type: ANDROID
system_id: 4464
security_level: 2
host: "https://keyxtractor.decryptlabs.com"
secret: "your_decrypt_labs_api_key_here"
- name: "decrypt_labs_playready_sl2"
type: "decrypt_labs"
device_name: "SL2" # Scheme identifier - must match exactly
device_type: PLAYREADY
system_id: 0
security_level: 2000
host: "https://keyxtractor.decryptlabs.com"
secret: "your_decrypt_labs_api_key_here"
- name: "decrypt_labs_playready_sl3"
type: "decrypt_labs"
device_name: "SL3" # Scheme identifier - must match exactly
device_type: PLAYREADY
system_id: 0
security_level: 3000
host: "https://keyxtractor.decryptlabs.com"
secret: "your_decrypt_labs_api_key_here"
# Key Vaults store your obtained Content Encryption Keys (CEKs)
# Use 'no_push: true' to prevent a vault from receiving pushed keys
# while still allowing it to provide keys when requested
key_vaults:
- type: SQLite
name: Local
path: key_store.db
# Additional vault types:
# - type: API
# name: "Remote Vault"
# uri: "https://key-vault.example.com"
# token: "secret_token"
# no_push: true # This vault will only provide keys, not receive them
# - type: MySQL
# name: "MySQL Vault"
# host: "127.0.0.1"
# port: 3306
# database: vault
# username: user
# password: pass
# no_push: false # Default behavior - vault both provides and receives keys
# Choose what software to use to download data
downloader: aria2c
# Options: requests | aria2c | curl_impersonate | n_m3u8dl_re
# Can also be a mapping:
# downloader:
# NF: requests
# AMZN: n_m3u8dl_re
# DSNP: n_m3u8dl_re
# default: requests
# aria2c downloader configuration
aria2c:
max_concurrent_downloads: 4
max_connection_per_server: 3
split: 5
file_allocation: falloc # none | prealloc | falloc | trunc
# N_m3u8DL-RE downloader configuration
n_m3u8dl_re:
thread_count: 16
ad_keyword: "advertisement"
use_proxy: true
# curl_impersonate downloader configuration
curl_impersonate:
browser: chrome120
# Pre-define default options and switches of the dl command
dl:
sub_format: srt
downloads: 4
workers: 16
lang:
- en
- fr
EXAMPLE:
bitrate: CBR
# Chapter Name to use when exporting a Chapter without a Name
chapter_fallback_name: "Chapter {j:02}"
# Case-Insensitive dictionary of headers for all Services
headers:
Accept-Language: "en-US,en;q=0.8"
User-Agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
# Override default filenames used across unshackle
filenames:
debug_log: "unshackle_debug_{service}_{time}.jsonl" # JSON Lines debug log file
config: "config.yaml"
root_config: "unshackle.yaml"
chapters: "Chapters_{title}_{random}.txt"
subtitle: "Subtitle_{id}_{language}.srt"
# conversion_method:
# - auto (default): Smart routing - subby for WebVTT/SAMI, pycaption for others
# - subby: Always use subby with advanced processing
# - pycaption: Use only pycaption library (no SubtitleEdit, no subby)
# - subtitleedit: Prefer SubtitleEdit when available, fall back to pycaption
# - pysubs2: Use pysubs2 library (supports SRT/SSA/ASS/WebVTT/TTML/SAMI/MicroDVD/MPL2/TMP)
subtitle:
conversion_method: auto
# sdh_method: Method to use for SDH (hearing impaired) stripping
# - auto (default): Try subby (SRT only), then SubtitleEdit (if available), then subtitle-filter
# - subby: Use subby library (SRT only)
# - subtitleedit: Use SubtitleEdit tool (Windows only, falls back to subtitle-filter)
# - filter-subs: Use subtitle-filter library directly
sdh_method: auto
# strip_sdh: Automatically create stripped (non-SDH) versions of SDH subtitles
# Set to false to disable automatic SDH stripping entirely (default: true)
strip_sdh: true
# convert_before_strip: Auto-convert VTT/other formats to SRT before using subtitle-filter
# This ensures compatibility when subtitle-filter is used as fallback (default: true)
convert_before_strip: true
# preserve_formatting: Preserve original subtitle formatting (tags, positioning, styling)
# When true, skips pycaption processing for WebVTT files to keep tags like <i>, <b>, positioning intact
# Combined with no sub_format setting, ensures subtitles remain in their original format (default: true)
preserve_formatting: true
# Configuration for pywidevine's serve functionality
serve:
api_secret: "your-secret-key-here"
users:
secret_key_for_user:
devices:
- generic_nexus_4464_l3
username: user
# devices:
# - '/path/to/device.wvd'
# Configuration data for each Service
services:
# Service-specific configuration goes here
# Profile-specific configurations can be nested under service names
# You can override ANY global configuration option on a per-service basis
# This allows fine-tuned control for services with special requirements
# Supported overrides: dl, aria2c, n_m3u8dl_re, curl_impersonate, subtitle, muxing, headers, etc.
# Example: Comprehensive service configuration showing all features
EXAMPLE:
# Standard service config
api_key: "service_api_key"
# Service certificate for Widevine L1/L2 (base64 encoded)
# This certificate is automatically used when L1/L2 schemes are selected
# Services obtain this from their DRM provider or license server
certificate: |
CAUSwwUKvQIIAxIQ5US6QAvBDzfTtjb4tU/7QxiH8c+TBSKOAjCCAQoCggEBAObzvlu2hZRsapAPx4Aa4GUZj4/GjxgXUtBH4THSkM40x63wQeyVxlEEo
# ... (full base64 certificate here)
# Profile-specific device configurations
profiles:
john_sd:
device:
app_name: "AIV"
device_model: "SHIELD Android TV"
jane_uhd:
device:
app_name: "AIV"
device_model: "Fire TV Stick 4K"
# NEW: Configuration overrides (can be combined with profiles and certificates)
# Override dl command defaults for this service
dl:
downloads: 4 # Limit concurrent track downloads (global default: 6)
workers: 8 # Reduce workers per track (global default: 16)
lang: ["en", "es-419"] # Different language priority for this service
sub_format: srt # Force SRT subtitle format
# Override n_m3u8dl_re downloader settings
n_m3u8dl_re:
thread_count: 8 # Lower thread count for rate-limited service (global default: 16)
use_proxy: true # Force proxy usage for this service
retry_count: 10 # More retries for unstable connections
ad_keyword: "advertisement" # Service-specific ad filtering
# Override aria2c downloader settings
aria2c:
max_concurrent_downloads: 2 # Limit concurrent downloads (global default: 4)
max_connection_per_server: 1 # Single connection per server
split: 3 # Fewer splits (global default: 5)
file_allocation: none # Faster allocation for this service
# Override subtitle processing for this service
subtitle:
conversion_method: pycaption # Use specific subtitle converter
sdh_method: auto
# Service-specific headers
headers:
User-Agent: "Service-specific user agent string"
Accept-Language: "en-US,en;q=0.9"
# Override muxing options
muxing:
set_title: true
# Example: Service with different regions per profile
SERVICE_NAME:
profiles:
us_account:
region: "US"
api_endpoint: "https://api.us.service.com"
uk_account:
region: "GB"
api_endpoint: "https://api.uk.service.com"
# Example: Rate-limited service
RATE_LIMITED_SERVICE:
dl:
downloads: 2 # Limit concurrent downloads
workers: 4 # Reduce workers to avoid rate limits
n_m3u8dl_re:
thread_count: 4 # Very low thread count
retry_count: 20 # More retries for flaky service
aria2c:
max_concurrent_downloads: 1 # Download tracks one at a time
max_connection_per_server: 1 # Single connection only
# Notes on service-specific overrides:
# - Overrides are merged with global config, not replaced
# - Only specified keys are overridden, others use global defaults
# - Reserved keys (profiles, api_key, certificate, etc.) are NOT treated as overrides
# - Any dict-type config option can be overridden (dl, aria2c, n_m3u8dl_re, subtitle, etc.)
# - CLI arguments always take priority over service-specific config
# External proxy provider services
proxy_providers:
nordvpn:
username: username_from_service_credentials
password: password_from_service_credentials
server_map:
us: 12 # force US server #12 for US proxies
surfsharkvpn:
username: your_surfshark_service_username # Service credentials from https://my.surfshark.com/vpn/manual-setup/main/openvpn
password: your_surfshark_service_password # Service credentials (not your login password)
server_map:
us: 3844 # force US server #3844 for US proxies
gb: 2697 # force GB server #2697 for GB proxies
au: 4621 # force AU server #4621 for AU proxies
windscribevpn:
username: your_windscribe_username # Service credentials from https://windscribe.com/getconfig/openvpn
password: your_windscribe_password # Service credentials (not your login password)
server_map:
us: "us-central-096.totallyacdn.com" # force US server
gb: "uk-london-055.totallyacdn.com" # force GB server
basic:
GB:
- "socks5://username:password@bhx.socks.ipvanish.com:1080" # 1 (Birmingham)
- "socks5://username:password@gla.socks.ipvanish.com:1080" # 2 (Glasgow)
AU:
- "socks5://username:password@syd.socks.ipvanish.com:1080" # 1 (Sydney)
- "https://username:password@au-syd.prod.surfshark.com" # 2 (Sydney)
- "https://username:password@au-bne.prod.surfshark.com" # 3 (Brisbane)
BG: "https://username:password@bg-sof.prod.surfshark.com"

118
unshackle/utils/base62.py Normal file
View File

@ -0,0 +1,118 @@
# -*- coding: utf-8 -*-
"""
base62
~~~~~~
Originated from http://blog.suminb.com/archives/558
"""
__title__ = "base62"
__author__ = "Sumin Byeon"
__email__ = "suminb@gmail.com"
__version__ = "1.0.0"
BASE = 62
CHARSET_DEFAULT = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
CHARSET_INVERTED = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encode(n, charset=CHARSET_DEFAULT):
"""Encodes a given integer ``n``."""
chs = []
while n > 0:
n, r = divmod(n, BASE)
chs.insert(0, charset[r])
if not chs:
return "0"
return "".join(chs)
def encodebytes(barray, charset=CHARSET_DEFAULT):
"""Encodes a bytestring into a base62 string.
:param barray: A byte array
:type barray: bytes
:rtype: str
"""
_check_type(barray, bytes)
# Count the number of leading zeros.
leading_zeros_count = 0
for i in range(len(barray)):
if barray[i] != 0:
break
leading_zeros_count += 1
# Encode the leading zeros as "0" followed by a character indicating the count.
# This pattern may occur several times if there are many leading zeros.
n, r = divmod(leading_zeros_count, len(charset) - 1)
zero_padding = f"0{charset[-1]}" * n
if r:
zero_padding += f"0{charset[r]}"
# Special case: the input is empty, or is entirely null bytes.
if leading_zeros_count == len(barray):
return zero_padding
value = encode(int.from_bytes(barray, "big"), charset=charset)
return zero_padding + value
def decode(encoded, charset=CHARSET_DEFAULT):
"""Decodes a base62 encoded value ``encoded``.
:type encoded: str
:rtype: int
"""
_check_type(encoded, str)
length, i, v = len(encoded), 0, 0
for x in encoded:
v += _value(x, charset=charset) * (BASE ** (length - (i + 1)))
i += 1
return v
def decodebytes(encoded, charset=CHARSET_DEFAULT):
"""Decodes a string of base62 data into a bytes object.
:param encoded: A string to be decoded in base62
:type encoded: str
:rtype: bytes
"""
leading_null_bytes = b""
while encoded.startswith("0") and len(encoded) >= 2:
leading_null_bytes += b"\x00" * _value(encoded[1], charset)
encoded = encoded[2:]
decoded = decode(encoded, charset=charset)
buf = bytearray()
while decoded > 0:
buf.append(decoded & 0xFF)
decoded //= 256
buf.reverse()
return leading_null_bytes + bytes(buf)
def _value(ch, charset):
"""Decodes an individual digit of a base62 encoded string."""
try:
return charset.index(ch)
except ValueError:
raise ValueError("base62: Invalid character (%s)" % ch)
def _check_type(value, expected_type):
"""Checks if the input is in an appropriate type."""
if not isinstance(value, expected_type):
msg = "Expected {} object, not {}".format(expected_type, value.__class__.__name__)
raise TypeError(msg)
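
A quick round trip of the helpers above, including the leading-null-byte case that encodebytes/decodebytes handle specially (assumes the module is importable as unshackle.utils.base62):

from unshackle.utils import base62

assert base62.encode(0) == "0"
n = 62 ** 3 + 5
assert base62.decode(base62.encode(n)) == n

data = b"\x00\x00hello"  # leading null bytes survive the round trip
assert base62.decodebytes(base62.encodebytes(data)) == data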

View File

@ -0,0 +1,24 @@
import platform
def get_os_arch(name: str) -> str:
"""Builds a name-os-arch based on the input name, system, architecture."""
os_name = platform.system().lower()
os_arch = platform.machine().lower()
# Map platform.system() output to desired OS name
if os_name == "windows":
os_name = "win"
elif os_name == "darwin":
os_name = "osx"
else:
os_name = "linux"
    # Map platform.machine() output to desired architecture
    if os_arch in ("x86_64", "amd64"):
        os_arch = "x64"
    elif os_arch in ("arm64", "aarch64"):
        os_arch = "arm64"
# Construct the dependency name in the desired format using the input name
return f"{name}-{os_name}-{os_arch}"

Some files were not shown because too many files have changed in this diff.