Add Redis queue system for online mode and parameter presets #63
t0mdavid-m wants to merge 66 commits into develop from
Conversation
Bumps [narwhals](https://github.com/narwhals-dev/narwhals) from 1.32.0 to 2.2.0. - [Release notes](https://github.com/narwhals-dev/narwhals/releases) - [Commits](narwhals-dev/narwhals@v1.32.0...v2.2.0) --- updated-dependencies: - dependency-name: narwhals dependency-version: 2.2.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [fonttools](https://github.com/fonttools/fonttools) from 4.56.0 to 4.59.1. - [Release notes](https://github.com/fonttools/fonttools/releases) - [Changelog](https://github.com/fonttools/fonttools/blob/main/NEWS.rst) - [Commits](fonttools/fonttools@4.56.0...4.59.1) --- updated-dependencies: - dependency-name: fonttools dependency-version: 4.59.1 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 5.29.4 to 6.32.0. - [Release notes](https://github.com/protocolbuffers/protobuf/releases) - [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/protobuf_release.bzl) - [Commits](https://github.com/protocolbuffers/protobuf/commits) --- updated-dependencies: - dependency-name: protobuf dependency-version: 6.32.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [packaging](https://github.com/pypa/packaging) from 24.2 to 25.0. - [Release notes](https://github.com/pypa/packaging/releases) - [Changelog](https://github.com/pypa/packaging/blob/main/CHANGELOG.rst) - [Commits](pypa/packaging@24.2...25.0) --- updated-dependencies: - dependency-name: packaging dependency-version: '25.0' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [narwhals](https://github.com/narwhals-dev/narwhals) from 2.3.0 to 2.5.0. - [Release notes](https://github.com/narwhals-dev/narwhals/releases) - [Commits](narwhals-dev/narwhals@v2.3.0...v2.5.0) --- updated-dependencies: - dependency-name: narwhals dependency-version: 2.5.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.49.0 to 1.49.1. - [Release notes](https://github.com/streamlit/streamlit/releases) - [Commits](streamlit/streamlit@1.49.0...1.49.1) --- updated-dependencies: - dependency-name: streamlit dependency-version: 1.49.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.44 to 3.1.45. - [Release notes](https://github.com/gitpython-developers/GitPython/releases) - [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES) - [Commits](gitpython-developers/GitPython@3.1.44...3.1.45) --- updated-dependencies: - dependency-name: gitpython dependency-version: 3.1.45 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [charset-normalizer](https://github.com/jawah/charset_normalizer) from 3.4.1 to 3.4.3. - [Release notes](https://github.com/jawah/charset_normalizer/releases) - [Changelog](https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md) - [Commits](jawah/charset_normalizer@3.4.1...3.4.3) --- updated-dependencies: - dependency-name: charset-normalizer dependency-version: 3.4.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [certifi](https://github.com/certifi/python-certifi) from 2025.1.31 to 2025.8.3. - [Commits](certifi/python-certifi@2025.01.31...2025.08.03) --- updated-dependencies: - dependency-name: certifi dependency-version: 2025.8.3 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Initial plan * Fix CAPTCHA input field not resetting after incorrect submission Added session state key management to the CAPTCHA input field to ensure it resets when an incorrect code is submitted. The fix adds a 'key' parameter to the text_input widget and deletes the corresponding session state key when the CAPTCHA is incorrect or correct, ensuring the field is properly cleared on rerun. Fixes issue where input field retained previous text after failed submission. Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> * Fix typo in CAPTCHA error message Changed "Captch is wrong" to "CAPTCHA is wrong" for better clarity. Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
* remove temp fix * add contrib libs the CMakeCache.txt * switch to 3.5.0 * fix windows * thirdparty is x86_64 now * remove restrictions for cython and autowrap * disable parquet support * add contrib libraries to path * remove deprecated pyinstaller * set openms share path
* enable thirdparty tools for template * add delayed expansion local
* Initial plan * Pin autowrap version to <=0.24 in Dockerfile Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
* Add Redis queue implementation plan for online mode This document outlines the architecture and implementation steps for introducing a Redis-based job queueing system to online deployments: - Uses RQ (Redis Queue) for task management - Docker Compose additions for Redis, worker, and dashboard services - New QueueManager class for queue interactions - Worker tasks module for background execution - Health monitoring utilities - Backward compatible with local mode (multiprocessing fallback) * Simplify to single-container architecture for Redis queue Update implementation plan to run Redis server and RQ worker within the same Docker container as the Streamlit app: - Redis server runs as background process via entrypoint script - RQ worker runs in same container with identical environment - No docker-compose orchestration complexity - All communication via localhost - Optional supervisord config for robust process management - Simplified deployment and debugging * Expand implementation plan with offline mode, worker config, and UX details - Add "Design Principles" section explaining plug & play architecture - Document offline mode (Windows installer) compatibility - zero changes needed - Add comprehensive "Configuring Worker Count" section with multiple methods - Add "User Experience: Queue Status Display" section showing what users see - Add "Sidebar Metrics" section for queue monitoring alongside CPU/RAM - Include code examples for all UI components * Implement Redis queue system for online mode workflows New files: - src/workflow/QueueManager.py: Redis queue interaction layer - src/workflow/tasks.py: Worker task definitions for RQ - src/workflow/health.py: Queue health check utilities Modified files: - Dockerfile: Install Redis server, add entrypoint for Redis/RQ workers - requirements.txt: Add redis and rq packages - settings.json: Add queue_settings configuration - src/workflow/WorkflowManager.py: Add queue support with local fallback - src/workflow/StreamlitUI.py: Add queue status display in execution section - src/common/common.py: Add sidebar queue metrics for online mode Features: - Automatic queue mode in online deployment, local mode unchanged - Queue position and progress display when workflows are queued - Sidebar metrics showing worker utilization and queue depth - Configurable worker count via RQ_WORKER_COUNT env variable - Graceful fallback to multiprocessing if Redis unavailable * Update settings.json Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix redis queue handling * minor fixes * minor fixes --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
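The "graceful fallback to multiprocessing if Redis unavailable" behavior described above can be sketched as a small dispatch helper. This is an illustrative sketch only; the function name, the injected `ping_redis` callable, and the mode strings are assumptions, not the template's actual QueueManager API.

```python
def choose_execution_mode(ping_redis):
    """Pick "queue" when Redis answers a ping, "local" otherwise.

    ping_redis: zero-argument callable (e.g. redis.Redis().ping);
    any exception is treated as "Redis unreachable".
    """
    try:
        if ping_redis():
            return "queue"
    except Exception:
        # Connection refused, timeout, missing package, etc.
        return "local"
    return "local"
```

A workflow manager could call this once when execution starts and route the job to RQ or to the existing multiprocessing path accordingly.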
- Add demo_workspaces configuration in settings.json with options for enabling, source directory, and auto-loading for online sessions - Create example-data/workspaces/default/ with pre-configured mzML files and params.json for immediate demo experience - Modify page_setup() to detect and copy demo workspaces: - When a demo workspace name is requested via URL, copy to new UUID workspace - Auto-load default demo for new online sessions when configured - Copy demo workspaces for local mode when workspaces are disabled - Add sidebar UI for online users to load demo data into current workspace - Add helper functions get_available_demo_workspaces() and copy_demo_workspace() This migrates the demo workspace handling pattern from FLASHApp to the template, allowing derivative apps to ship with pre-configured example data that users can explore without uploading their own files.
Users must now explicitly load demo data via the sidebar UI or access a demo workspace URL (e.g., ?workspace=default). New online sessions start with an empty workspace.
- Add get_demo_source_dirs() helper to normalize config and return list of existing directories - Add find_demo_workspace_path() to search directories in order - Update get_available_demo_workspaces() to collect demos from all directories (first occurrence wins for naming conflicts) - Update copy_demo_workspace() to use multi-directory lookup - Refactor page_setup() and render_sidebar() to use new helper functions - Change settings.json from source_dir (string) to source_dirs (array) - Maintain backward compatibility with legacy source_dir format This enables mounting external demo directories (e.g., Docker volumes) alongside bundled demos. Non-existent directories are silently skipped.
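The multi-directory lookup with first-occurrence-wins semantics might look roughly like this sketch (the template's real helpers are named get_demo_source_dirs() and get_available_demo_workspaces(); this standalone function is illustrative):

```python
from pathlib import Path

def collect_demo_workspaces(source_dirs):
    """Scan demo source directories in configured order.

    On a naming conflict the first occurrence wins; directories
    that do not exist (e.g. an unmounted Docker volume) are
    silently skipped.
    """
    demos = {}
    for d in source_dirs:
        base = Path(d)
        if not base.is_dir():
            continue  # silently skip missing directories
        for child in sorted(base.iterdir()):
            if child.is_dir() and child.name not in demos:
                demos[child.name] = child
    return demos
```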
- Rename example-data/workspaces/default to example_demo - Remove pre-loaded mzML files from demo workspace (~30MB savings) - Update params.json with empty file selection - Add .gitkeep to preserve empty mzML-files directory The demo workspace now serves as an empty template. Users can load example data via the existing "Load Example" button in fileupload.py which uses example-data/mzML/ as the source.
- Add is_safe_workspace_name() to reject names with path separators or parent directory references (/, \, .., .) - Apply validation in find_demo_workspace_path() to prevent demo name path traversal attacks - Apply validation in page_setup() for workspace query parameter, falling back to new UUID workspace if invalid Addresses Copilot security review comments #1 and #2.
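A minimal version of that validation could read as follows; this is a sketch of the rule described above, and the template's actual is_safe_workspace_name() may check more cases:

```python
def is_safe_workspace_name(name):
    """Reject names that could traverse outside the demo directory:
    empty strings, the special entries "." and "..", and anything
    containing a path separator."""
    if not name or name in (".", ".."):
        return False
    return "/" not in name and "\\" not in name
```

Applying the check both where demo names are resolved and where the workspace query parameter is read means an invalid name simply falls back to a fresh UUID workspace.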
add callable and reactive parameters
…oZQa Plan demo workspace migration to template
The multiselect widget returns a list, but TOPP tools expect newline-separated strings. Use a separate key for the multiselect widget and sync to the original key as a newline-separated string via on_change callback. https://claude.ai/code/session_016FpXUEwfPanXWu4CLrFLSA
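The sync pattern amounts to a small on_change callback, sketched here with illustrative names (the real wiring lives in the template's StreamlitUI):

```python
def sync_multiselect_to_param(state, key):
    """on_change callback: the multiselect widget writes a list to
    f"{key}_display"; mirror it into the original key as the
    newline-separated string that TOPP tools expect."""
    state[key] = "\n".join(state[f"{key}_display"])

# In the UI this would be wired up roughly as:
# st.multiselect(..., key=f"{key}_display",
#                on_change=sync_multiselect_to_param,
#                args=(st.session_state, key))
```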
Use default= parameter directly instead of manual session state initialization, more consistent with how selectbox uses index= parameter. https://claude.ai/code/session_016FpXUEwfPanXWu4CLrFLSA
Skip keys ending in _display in save_parameters() to prevent multiselect display keys from being incorrectly saved to JSON or compared with INI values. https://claude.ai/code/session_016FpXUEwfPanXWu4CLrFLSA
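The guard in save_parameters() amounts to filtering those widget-only keys before writing, roughly:

```python
def drop_display_keys(params):
    """Exclude multiselect display keys (ending in "_display") so they
    are never written to params.json or compared with INI values."""
    return {k: v for k, v in params.items() if not k.endswith("_display")}
```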
Adds a new "Save as Demo" option in the sidebar (online mode only) that allows password-protected saving of the current workspace as a demo. The admin password is stored in .streamlit/secrets.toml which is gitignored. - Add src/common/admin.py with admin utilities (password verification, demo saving with symlink handling) - Add .streamlit/secrets.toml.example as a template - Modify sidebar to show Save as Demo expander in online mode - Warn and require confirmation when overwriting existing demos https://claude.ai/code/session_01JzDsFmXBRSFdvGV4TgqXwk
…7bm5 Fix multiselect parameter format for TOPP tool compatibility
When copying demo files on Linux, params.json is now copied as a regular file instead of being symlinked. This allows the params.json file to be modified independently of the source demo data. https://claude.ai/code/session_012dUQQTySG19toAR3FH7xme
.ini files contain TOPP tool parameter configurations that users may modify. Like params.json, these should be copied rather than symlinked to prevent users from affecting each other in multi-user environments. https://claude.ai/code/session_012dUQQTySG19toAR3FH7xme
Handle params.json as copyable file instead of symlink
Defer the import of is_safe_workspace_name inside save_workspace_as_demo to avoid circular dependency between admin.py and common.py. https://claude.ai/code/session_01JzDsFmXBRSFdvGV4TgqXwk
…re-ZzPh9 Add admin functionality to save workspaces as demo templates
Changed stderr logging level from 2 to 0 so that error output from workflows appears in the minimal log file, making it easier for users to see errors without viewing full logs. https://claude.ai/code/session_01L9vBwhCPsXUTTxen3sW8rn Co-authored-by: Claude <noreply@anthropic.com>
Previously only .streamlit/config.toml was copied. This change copies the entire .streamlit folder including credentials.toml and secrets.toml.example files. https://claude.ai/code/session_01VQFwSZJHdHDfhfQxF4FqiQ Co-authored-by: Claude <noreply@anthropic.com>
Changed truthiness check `if v:` to explicit `if v != "" and v is not None:` to properly handle numeric zero values (0, 0.0) which are valid parameters. Previously, setting fragment_bin_offset=0 would skip the value entirely, causing malformed command lines where the next flag was interpreted as the previous parameter's value.
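The distinction matters because 0 and 0.0 are falsy in Python. A sketch of the corrected predicate (the function name is illustrative):

```python
def should_emit_value(v):
    """True if v belongs on the command line: empty string and None
    are skipped, but numeric zero (0, 0.0) is a valid parameter
    value and must be emitted."""
    return v != "" and v is not None
```

With the old `if v:` check, `should_emit_value(0)` would have been False, so `-fragment_bin_offset 0` lost its value and the next flag was parsed as the offset.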
…i() (#327) - Add create_ini() method to ParameterManager for creating TOPP tool ini files - Update save_parameters() to use create_ini() instead of skipping missing ini files - Update input_TOPP() in StreamlitUI to use the shared method - Removes code duplication and enables ini creation during parameter save
…331) The embeddable Python distribution lacks development headers (Python.h) required to compile native extensions. When using the embeddable Python's pip directly, packages with native code fail to build with: fatal error C1083: Cannot open include file: 'Python.h' This fix uses the system Python (which has headers) to compile packages, while directing the output to the embeddable Python's site-packages directory using pip's --target flag. Changes: - build-windows-executable-app.yaml: Use system Python with --target, remove unnecessary pip installation in embeddable Python - test-win-exe-w-embed-py.yaml: Add setup-python action, use system Python with --target, remove pip installation step - docs/win_exe_with_embed_py.md: Add prerequisites section explaining the need for system Python, update install command to use --target https://claude.ai/code/session_0152zcfBS5dWLtM12ubuYrKC Co-authored-by: Claude <noreply@anthropic.com>
* feat: add parameter presets system for quick workflow configuration Adds a new presets feature that allows users to quickly apply optimized parameter configurations with a single click. Key changes: - ParameterManager: Added load_presets(), get_preset_names(), get_preset_description(), and apply_preset() methods - StreamlitUI: Added preset_buttons() method that renders a grid of preset buttons with tooltips - parameter_section() now displays preset buttons when available - Created presets.json with example presets for the TOPP workflow (High Sensitivity, High Specificity, Fast Analysis, Metabolomics Default) The feature is opt-in and backward compatible: workflows without a presets.json file will not show any preset buttons. https://claude.ai/code/session_01BnipAqTf16J7kJVfnzah2U * refactor: pass workflow name explicitly to ParameterManager - ParameterManager.__init__() now accepts optional workflow_name param - WorkflowManager passes workflow name directly to ParameterManager - load_presets() normalizes workflow name (lowercase, hyphens) for lookup - Added tests for workflow_name parameter functionality This makes the preset lookup contract clearer: presets.json keys should use the normalized format (e.g., "topp-workflow" for "TOPP Workflow"). 
https://claude.ai/code/session_01BnipAqTf16J7kJVfnzah2U * build: add presets.json to Docker and Windows deployments Include presets.json in all deployment configurations: - Dockerfile (full OpenMS build) - Dockerfile_simple (pyopenms-only build) - Windows PyInstaller workflow - Windows deployment documentation https://claude.ai/code/session_01BnipAqTf16J7kJVfnzah2U * fix: use st.rerun() instead of full page reload for presets Replace streamlit_js_eval page reload with st.rerun() for smoother UX: - No page flash/flicker - Scroll position preserved - Faster (no re-fetching static assets) Uses "delete-then-rerun" pattern: apply_preset() now deletes session_state keys instead of setting them, forcing widgets to re-initialize fresh from params.json. This avoids fragment caching issues. Changes: - ParameterManager.apply_preset(): delete keys instead of setting them - ParameterManager.clear_parameter_session_state(): new helper method - StreamlitUI: preset_buttons(), "Load default parameters", and "Import parameters" now use st.rerun() with st.toast() for feedback - Removed unused streamlit_js_eval import - Updated tests for delete-based behavior https://claude.ai/code/session_01BnipAqTf16J7kJVfnzah2U * docs: add parameter presets documentation Document the new presets feature in three locations: - docs/user_guide.md: User-facing guide on how to use preset buttons - docs/build_app.md: Developer guide on presets.json format and structure - docs/toppframework.py: Interactive TOPP framework docs with st.help() https://claude.ai/code/session_01BnipAqTf16J7kJVfnzah2U --------- Co-authored-by: Claude <noreply@anthropic.com>
…#332) - Add max_threads configuration to settings.json with separate values for local (4) and online (2) deployments - Add get_max_threads() helper function in common.py that returns appropriate thread count based on deployment mode - Add Threads number input in parameter_section for local mode UI - Update CommandExecutor.run_multiple_commands() to use semaphore limiting based on max_threads setting - Update CommandExecutor.run_topp() to pass -threads parameter to TOPP tools with intelligently distributed thread allocation Thread distribution algorithm: - parallel_commands = min(num_files, max_threads) - threads_per_command = max(1, max_threads // parallel_commands) This ensures optimal resource usage: with 4 max_threads and 2 files, runs 2 parallel commands with 2 threads each; with 4 files, runs 4 parallel commands with 1 thread each. https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi Co-authored-by: Claude <noreply@anthropic.com>
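The distribution algorithm above can be written as a two-line helper; this sketch adds a guard for zero files, which the summary does not spell out:

```python
def distribute_threads(num_files, max_threads):
    """Split a thread budget across parallel TOPP commands.

    With 4 max_threads and 2 files: 2 commands x 2 threads each.
    With 4 max_threads and 4 files: 4 commands x 1 thread each.
    """
    # max(1, ...) guards against num_files == 0 (illustrative addition)
    parallel_commands = max(1, min(num_files, max_threads))
    threads_per_command = max(1, max_threads // parallel_commands)
    return parallel_commands, threads_per_command
```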
Buffer stderr during process execution and only write to minimal.log if the process exits with non-zero code. Stderr is still logged to all.log in real-time for debugging purposes. https://claude.ai/code/session_01UbaxXTo3jzy3rk6xvdWU1a Co-authored-by: Claude <noreply@anthropic.com>
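A self-contained sketch of that buffering scheme follows; the real code lives in the template's command executor and logger, so the function name and the plain file-handle arguments here are stand-ins:

```python
import subprocess

def run_with_buffered_stderr(cmd, minimal_log, full_log):
    """Stream stderr into full_log in real time, but hold it back
    from minimal_log unless the process exits non-zero."""
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    buffered = []
    for line in proc.stderr:
        full_log.write(line)    # always available for debugging
        buffered.append(line)   # only surfaces on failure
    rc = proc.wait()
    if rc != 0:
        minimal_log.writelines(buffered)
    return rc
```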
* fix: Windows executable build CAB file limit Use MediaTemplate to automatically split files into multiple CAB archives and clean up unnecessary Python files (__pycache__, tests, docs) to reduce file count below the 65,535 CAB limit. * fix: Keep docs and .pyc files in cleanup step
* feat: add configurable max threads for local and online deployments - Add max_threads configuration to settings.json with separate values for local (4) and online (2) deployments - Add get_max_threads() helper function in common.py that returns appropriate thread count based on deployment mode - Add Threads number input in parameter_section for local mode UI - Update CommandExecutor.run_multiple_commands() to use semaphore limiting based on max_threads setting - Update CommandExecutor.run_topp() to pass -threads parameter to TOPP tools with intelligently distributed thread allocation Thread distribution algorithm: - parallel_commands = min(num_files, max_threads) - threads_per_command = max(1, max_threads // parallel_commands) This ensures optimal resource usage: with 4 max_threads and 2 files, runs 2 parallel commands with 2 threads each; with 4 files, runs 4 parallel commands with 1 thread each. https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * fix: properly initialize session state for Threads number input Pre-initialize max_threads_override in session state before rendering the widget, and remove the value parameter to let Streamlit read from session state. This ensures user changes are properly tracked. https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * fix: persist Threads parameter across page navigation Use input_widget() which integrates with the parameter manager to persist the value to params.json. Also copy value to session state for get_max_threads() to access. 
https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * refactor: use parameter manager directly in get_max_threads() - Pass parameter_manager to get_max_threads() to read persisted value - Remove session state copying workaround in StreamlitUI - Cleaner design that reads directly from params.json https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * refactor: move _get_max_threads() to CommandExecutor - Move function from common.py to CommandExecutor as private method - Simplifies code by using self.parameter_manager directly - Remove unnecessary import and parameter passing https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * fix: validate _get_max_threads() return value Coerce to int and clamp to minimum of 1 to prevent: - Semaphore(0) deadlocks - Divide-by-zero in threads_per_command calculation - Issues from non-numeric config values https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * style: move streamlit import to top of file https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * feat: log warning for invalid max_threads values Logs to minimal log (level 0) when max_threads is non-numeric or < 1. https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * simplify: remove warning logging from _get_max_threads() UI already enforces min_value=1, keep just the validation. https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi * simplify: remove exception handling, fail loud on invalid config https://claude.ai/code/session_01XGuw7AxoKRWHr9EZX1SDXi --------- Co-authored-by: Claude <noreply@anthropic.com>
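After the simplifications in that sequence, the validation reduces to a one-liner; since the final commit removed exception handling, a non-numeric config value fails loudly via `int()`:

```python
def validate_max_threads(value):
    """Coerce the configured max_threads to int and clamp to >= 1,
    preventing Semaphore(0) deadlocks and divide-by-zero in the
    threads_per_command calculation. Non-numeric values raise."""
    return max(1, int(value))
```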
…ents (#336) * Add nginx load balancer for scaling Streamlit in a single container When STREAMLIT_SERVER_COUNT > 1, the entrypoint dynamically generates an nginx config and launches multiple Streamlit instances on internal ports (8510+), with nginx on port 8501 using ip_hash sticky sessions for WebSocket compatibility. Default (STREAMLIT_SERVER_COUNT=1) preserves existing behavior with no nginx overhead. https://claude.ai/code/session_018VEL5xKZfe4LCcUa8iUHJ9 * Fix nginx config: create /etc/nginx directory before writing config https://claude.ai/code/session_018VEL5xKZfe4LCcUa8iUHJ9 * Fix nginx: use absolute path /usr/sbin/nginx The mamba environment activation shadows system binaries on the PATH. https://claude.ai/code/session_018VEL5xKZfe4LCcUa8iUHJ9 * Switch nginx from ip_hash to least_conn for load balancing ip_hash pins all users behind the same NAT/VPN/reverse-proxy to a single backend, defeating the load balancer. least_conn distributes new connections to the instance with fewest active connections, and once a WebSocket is established it stays on that backend for the session lifetime, so sticky sessions are not needed. https://claude.ai/code/session_018VEL5xKZfe4LCcUa8iUHJ9 * Fix file uploads: disable nginx client_max_body_size limit nginx defaults to 1MB max body size, which blocks Streamlit file uploads with a 400 error. Set to 0 (unlimited) to let Streamlit enforce its own 200MB limit from config.toml. https://claude.ai/code/session_018VEL5xKZfe4LCcUa8iUHJ9 * Fix file uploads: switch to hash-based sticky sessions least_conn routes each HTTP request independently, so the file upload POST (/_stcore/upload_file) can land on a different backend than the WebSocket session, causing a 400 error. 
Use hash $remote_addr$http_x_forwarded_for consistent instead: - Provides session affinity so uploads hit the correct backend - Behind a reverse proxy: XFF header differentiates real client IPs - Direct connections: falls back to remote_addr (like ip_hash) - "consistent" minimizes redistribution when backends are added/removed https://claude.ai/code/session_018VEL5xKZfe4LCcUa8iUHJ9 * Implement cookie-based sticky sessions for nginx load balancer Replace ip_hash/hash-on-IP with cookie-based session affinity using nginx's built-in map and $request_id: - map $cookie_stroute $route_key: if browser has a "stroute" cookie, reuse its value; otherwise fall back to $request_id (a unique random hex string nginx generates per-request) - hash $route_key consistent: route based on the cookie/random value - add_header Set-Cookie on every response to persist the routing key This ensures each browser gets its own sticky backend regardless of source IP, fixing both: - File uploads (POST must hit the same backend as the WebSocket session) - Load distribution when all users share the same IP (NAT/VPN/proxy) No new packages required - uses only built-in nginx directives. https://claude.ai/code/session_018VEL5xKZfe4LCcUa8iUHJ9 --------- Co-authored-by: Claude <noreply@anthropic.com>
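The entrypoint's dynamic config generation can be illustrated with a small Python sketch. The directives follow the commits above (`map $cookie_stroute`, `hash $route_key consistent`, `Set-Cookie`, `client_max_body_size 0`, backends on ports 8510+), but the function name, default ports, and surrounding server-block details are illustrative, not the actual entrypoint script:

```python
def nginx_lb_config(server_count, base_port=8510, listen_port=8501):
    """Render an nginx fragment for cookie-based sticky sessions
    across several Streamlit instances on localhost ports
    base_port, base_port + 1, ..."""
    backends = "\n".join(
        f"    server 127.0.0.1:{base_port + i};" for i in range(server_count)
    )
    return f"""\
map $cookie_stroute $route_key {{
    default $cookie_stroute;
    ""      $request_id;
}}
upstream streamlit {{
    hash $route_key consistent;
{backends}
}}
server {{
    listen 0.0.0.0:{listen_port};
    client_max_body_size 0;
    location / {{
        proxy_pass http://streamlit;
        add_header Set-Cookie "stroute=$route_key; Path=/";
    }}
}}
"""
```

A real config would also need the WebSocket proxy headers (`Upgrade`/`Connection`) that Streamlit requires; they are omitted here for brevity.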
Change the nginx listen directive from `listen 8501` to `listen 0.0.0.0:8501` so the load balancer explicitly accepts connections on all network interfaces, not just the default. https://claude.ai/code/session_01721rHYJhSZfJnnUpWHXbVs Co-authored-by: Claude <noreply@anthropic.com>
In single-instance mode, `streamlit run app.py` was invoked without `--server.address`, causing Streamlit to default to localhost and be unreachable from outside the container. This adds `--server.address 0.0.0.0` to the single-instance entrypoint in both Dockerfiles, sets `address = "0.0.0.0"` in .streamlit/config.toml, and makes the nginx listen directive in Dockerfile_simple explicitly bind 0.0.0.0. https://claude.ai/code/session_01GX2U6UAAj8sF1smN9Lc4J5 Co-authored-by: Claude <noreply@anthropic.com>
* Bind Streamlit to all interfaces (0.0.0.0) in all Docker modes In single-instance mode, `streamlit run app.py` was invoked without `--server.address`, causing Streamlit to default to localhost and be unreachable from outside the container. This adds `--server.address 0.0.0.0` to the single-instance entrypoint in both Dockerfiles, sets `address = "0.0.0.0"` in .streamlit/config.toml, and makes the nginx listen directive in Dockerfile_simple explicitly bind 0.0.0.0. https://claude.ai/code/session_01GX2U6UAAj8sF1smN9Lc4J5 * Bind multi-instance Streamlit to all interfaces (0.0.0.0) The load-balanced Streamlit instances were still using --server.address 127.0.0.1, causing them to only listen on localhost. Change to 0.0.0.0 so Streamlit listens on all IPs in all cases (single-instance and multi-instance). https://claude.ai/code/session_01GX2U6UAAj8sF1smN9Lc4J5 --------- Co-authored-by: Claude <noreply@anthropic.com>
* Add Matomo Tag Manager as third analytics tracking mode Adds Matomo Tag Manager support alongside existing Google Analytics and Piwik Pro integrations. Includes settings.json configuration (url + tag), build-time script injection via hook-analytics.py, Klaro GDPR consent banner integration, and runtime consent granting via MTM data layer API. https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h * Fix Matomo Tag Manager snippet to match official docs - Accept full container JS URL instead of separate url + tag fields, supporting both self-hosted and Matomo Cloud URL patterns - Match the official snippet: var _mtm alias, _mtm.push shorthand - Remove redundant type="text/javascript" attribute - Remove unused "tag" field from settings.json https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h * Split Matomo config into base url + tag fields Separate the Matomo setting into `url` (base URL, e.g. https://cdn.matomo.cloud/openms.matomo.cloud) and `tag` (container ID, e.g. yDGK8bfY), consistent with how other providers use a tag field. The script constructs the full path: {url}/container_{tag}.js https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h * install matomo tag --------- Co-authored-by: Claude <noreply@anthropic.com>
* Initial plan * fix: remove duplicate address entry in config.toml Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
Resolves merge conflicts between FLASHApp customizations and upstream template improvements. Key decisions: - Keep FLASHApp-specific: TOPP_TOOLS, vue-js-component build, polars handling, OpenMS fork/branch, pinned cython/autowrap versions - Take template: Qt install optimization, Literal/Callable imports, address binding, Redis/RQ deps, demo workspace loader, save-as-demo, spectrum plotting settings, queue dispatch, import time - Combine: maxUploadSize=2000 + address=0.0.0.0, boolean flag handling + empty/None/zero params, st.status + progress bar + queue awareness, double-press guard inside _start_workflow_local + queue dispatch - Delete: 6 docs + Dockerfile_simple + test-win-exe-w-pyinstaller + export_consensus_feature_df.py (all previously removed from FLASHApp) https://claude.ai/code/session_011JfuSn9A9sfCB1egKiyjJR
📝 Walkthrough

This PR introduces a Redis-backed job queue system for online deployments, enhances Windows CI/CD workflows, adds parameter presets and demo workspace management, integrates Matomo analytics, and refactors workflow execution to support both queued and local fallback modes.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UI as Streamlit UI
    participant WM as WorkflowManager
    participant QM as QueueManager
    participant Redis as Redis/RQ
    participant Worker as Worker Process
    participant Workflow as Workflow Engine

    UI->>WM: execution() [online mode]
    WM->>WM: Detect online mode
    WM->>QM: submit_job(execute_workflow, ...)
    QM->>Redis: enqueue job
    Redis-->>QM: job_id
    QM->>QM: store_job_id(workflow_dir)
    QM-->>WM: job_id
    WM->>UI: Return execution status

    par Status Polling
        UI->>WM: get_workflow_status()
        WM->>QM: get_job_info(job_id)
        QM->>Redis: fetch job details
        Redis-->>QM: job metadata
        QM-->>WM: JobInfo object
        WM-->>UI: status dict
    end

    par Worker Execution
        Redis->>Worker: Dequeue job
        Worker->>Workflow: execute_workflow(params)
        Workflow->>Workflow: Load parameters, run execution
        Workflow-->>Worker: Result dict
        Worker->>Redis: Update job status/progress
    end

    alt Job Success
        UI->>WM: get_workflow_status()
        WM->>QM: get_job_info(job_id)
        QM->>Redis: Fetch completed job
        Redis-->>QM: FINISHED status + result
        QM-->>UI: Display results
    else Job Failure
        QM->>Redis: FAILED status + error
        QM-->>UI: Display error message
    end
```
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/workflow/CommandExecutor.py (1)
82-105: ⚠️ Potential issue | 🟠 Major

Thread startup failures can be reported as success.

If `run_command()` raises inside a worker thread before it returns `False` (for example, if `subprocess.Popen()` fails), that thread never appends a result. `all(results)` can then still evaluate to `True`, so a partially started batch is treated as fully successful.

💡 Suggested hardening

```diff
-        results = []
+        results: list[bool] = []
         lock = threading.Lock()
         semaphore = threading.Semaphore(parallel_commands)

         def run_and_track(cmd):
             with semaphore:
-                success = self.run_command(cmd)
+                try:
+                    success = self.run_command(cmd)
+                except Exception as e:
+                    self.logger.log(
+                        f"ERROR: failed to start command {' '.join(map(str, cmd))}: {e}",
+                        0,
+                    )
+                    success = False
             with lock:
                 results.append(success)
@@
-        return all(results)
+        return len(results) == num_commands and all(results)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/CommandExecutor.py` around lines 82 - 105, The worker thread wrapper run_and_track should ensure every thread appends a result even if run_command raises; modify run_and_track to catch Exception around self.run_command(cmd), log the exception (use self.logger) and append False under the same lock so results length matches threads, then re-use the existing semaphore and join logic; reference run_and_track, self.run_command, results, lock, semaphore and the final all(results) check to locate where to add the try/except and logging so a thread-start failure cannot be misreported as success.
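As a standalone illustration of the pattern the suggestion describes, here is a minimal sketch; the names `run_command` and `parallel_commands` mirror the review's context, while the injected `run_command` callable and `print`-based logging are simplifications, not the actual `CommandExecutor` API:

```python
import threading


def run_commands_in_parallel(commands, run_command, parallel_commands=2):
    """Run commands in worker threads; a thread whose run_command raises
    still records a False result, so failures cannot be miscounted."""
    results: list[bool] = []
    lock = threading.Lock()
    semaphore = threading.Semaphore(parallel_commands)

    def run_and_track(cmd):
        with semaphore:
            try:
                success = run_command(cmd)
            except Exception as e:
                # A startup failure (e.g. Popen raising) must not vanish silently.
                print(f"ERROR: failed to start command {cmd}: {e}")
                success = False
        with lock:
            results.append(success)

    threads = [threading.Thread(target=run_and_track, args=(c,)) for c in commands]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Guard against missing results as well as explicit failures.
    return len(results) == len(commands) and all(results)
```

With this shape, a raising worker and a clean worker both land in `results`, so the final check stays honest.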
🧹 Nitpick comments (12)
src/common/captcha_.py (1)
248-248: Avoid duplicating the CAPTCHA length.

`max_chars=5` can drift from `length_captcha`. Reuse the constant here so the widget stays aligned if the CAPTCHA length changes later.

♻️ Proposed cleanup

```diff
-    capta2_text = c1.text_input("Enter captcha text", max_chars=5, key="captcha_input")
+    capta2_text = c1.text_input(
+        "Enter captcha text",
+        max_chars=length_captcha,
+        key="captcha_input",
+    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/common/captcha_.py` at line 248, The text_input call for the CAPTCHA uses a hardcoded max_chars=5 which can drift from the CAPTCHA length constant; update the c1.text_input invocation (where capta2_text is defined) to use the existing length_captcha constant instead of the literal 5 so the input widget stays in sync with the CAPTCHA generation logic.

Dockerfile (1)
189-205: Supervise the background queue processes.

Redis, the `rq` workers, and any extra Streamlit instances are all backgrounded, but PID 1 only `exec`s nginx/Streamlit. If one of those children exits later, the container stays up while online execution is broken. Consider a supervisor or a small wait/healthcheck loop so the deployment fails fast instead of serving a degraded stack.

Also applies to: 225-239
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Dockerfile` around lines 189 - 205, Backgrounded processes (redis-server and rq worker(s) started using WORKER_COUNT / RQ_WORKER_COUNT) are not supervised so if any child dies the container remains running under PID 1 (nginx/Streamlit) and the stack is degraded; change the startup script to capture PIDs for redis-server and each rq worker, install signal handlers to forward SIGTERM/SIGINT to those PIDs, and run a supervisor loop that waits for any child to exit (e.g., using wait -n or polling) and then exits non‑zero so the container fails fast; ensure the same supervision is applied to any extra Streamlit instances and that nginx/Streamlit (PID 1) is only exec'd after the monitor is in place or replace it by an exec of a tiny supervisor that launches nginx/Streamlit and the background services.

src/workflow/ParameterManager.py (2)
23-23: Use an explicit `Optional` type hint.

PEP 484 prohibits implicit `Optional`. Use an explicit union type for clarity.

♻️ Proposed fix

```diff
-    def __init__(self, workflow_dir: Path, workflow_name: str = None):
+    def __init__(self, workflow_dir: Path, workflow_name: str | None = None):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/ParameterManager.py` at line 23, The __init__ signature in class ParameterManager uses an implicit Optional by setting workflow_name: str = None; change it to an explicit Optional type (e.g., workflow_name: Optional[str] or Union[str, None]) and add the corresponding import from typing (Optional or Union) at the top of the file so the constructor reads something like def __init__(self, workflow_dir: Path, workflow_name: Optional[str] = None): and type-checkers will accept it.
131-133: Catch specific exception types instead of a bare `except`.

The bare `except` clause catches all exceptions, including `SystemExit` and `KeyboardInterrupt`. Specify the expected exception types.

♻️ Proposed fix

```diff
         try:
             with open(self.params_file, "r", encoding="utf-8") as f:
                 return json.load(f)
-        except:
+        except (json.JSONDecodeError, IOError, OSError):
             st.error("**ERROR**: Attempting to load an invalid JSON parameter file. Reset to defaults.")
             return {}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/ParameterManager.py` around lines 131 - 133, The bare except in ParameterManager (the JSON parameter file load function) should be replaced with specific exception types; update the except to catch json.JSONDecodeError (for invalid JSON) and OSError/FileNotFoundError (for file I/O issues) instead of a blanket except, ensure json is imported if not already, and keep the existing st.error message and return {} inside that specific except block so only expected JSON/file errors are handled while allowing SystemExit/KeyboardInterrupt to propagate.

src/common/common.py (2)
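A runnable sketch of the narrowed handler, with `st.error` replaced by a plain fallback return since Streamlit is not assumed here (in Python 3, `IOError` is an alias of `OSError`, so catching `OSError` covers both):

```python
import json
from pathlib import Path


def load_params(params_file: Path) -> dict:
    """Load workflow parameters, falling back to {} on invalid JSON or I/O
    errors while letting SystemExit/KeyboardInterrupt propagate."""
    try:
        with open(params_file, "r", encoding="utf-8") as f:
            return json.load(f)
    except (json.JSONDecodeError, OSError):
        # Invalid JSON, missing file, or unreadable file: reset to defaults.
        return {}
```

A missing file raises `FileNotFoundError`, an `OSError` subclass, so it is covered without a separate clause.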
408-410: Use a truthiness check instead of an equality comparison to `True`.

Per PEP 8, prefer truthiness checks over explicit `== True` comparisons.

♻️ Proposed fix

```diff
-    if (st.session_state.settings["analytics"]["matomo"]["enabled"]) and (
-        st.session_state.tracking_consent["matomo"] == True
-    ):
+    if (st.session_state.settings["analytics"]["matomo"]["enabled"]) and (
+        st.session_state.tracking_consent["matomo"]
+    ):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/common/common.py` around lines 408 - 410, The if-statement in common.py uses an explicit equality check against True (st.session_state.tracking_consent["matomo"] == True); change this to a truthiness check (e.g., just use st.session_state.tracking_consent["matomo"]) so the condition reads: if (st.session_state.settings["analytics"]["matomo"]["enabled"]) and (st.session_state.tracking_consent["matomo"]): — update the condition in the block where this check appears (referenced by the existing if statement) to follow PEP 8 truthiness style.
758-764: Use an identity check instead of an equality comparison to `True`.

Same pattern as above, except that here the value can also be a string, so an identity check is appropriate.

♻️ Proposed fix

```diff
-    if st.session_state["spectrum_bin_peaks"] == True:
+    if st.session_state["spectrum_bin_peaks"] is True:
```

Note: Using `is True` here since the selectbox can also return the string `"auto"`, and we want to check for the boolean `True` specifically rather than general truthiness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/common/common.py` around lines 758 - 764, The condition currently compares st.session_state["spectrum_bin_peaks"] == True; change this to an identity check so it specifically detects the boolean True value returned by st.selectbox (i.e., use "is True") and leave the st.number_input and default assignment to st.session_state["spectrum_num_bins"] unchanged; update the condition in the block that uses st.selectbox("Bin Peaks", ["auto", True, False]) and the subsequent if controlling st.number_input("Number of Bins (m/z)", ...) so it reads the identity check against True.

src/common/admin.py (1)
146-151: Consider more specific exception handling.

The broad `Exception` catch at line 150 could mask unexpected errors. Consider catching more specific exceptions or logging the unexpected ones for debugging.

♻️ Proposed improvement

```diff
     except PermissionError:
         return False, "Permission denied. Cannot write to demo directory."
     except OSError as e:
-        return False, f"Failed to save demo: {str(e)}"
-    except Exception as e:
-        return False, f"Unexpected error: {str(e)}"
+        return False, f"Failed to save demo: {e!s}"
```

If you need to catch unexpected exceptions for robustness, consider logging them:

```python
    except Exception as e:
        import logging
        logging.exception("Unexpected error saving demo")
        return False, f"Unexpected error: {e!s}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/common/admin.py` around lines 146 - 151, The catch-all except Exception block in the save demo flow (the except following PermissionError and OSError) is too broad and should either be narrowed or at minimum log the full traceback before returning the user-friendly message; update the generic handler (the final except block) to catch specific unexpected error types where possible or call logging.exception("Unexpected error saving demo") (or use the module logger) to record the traceback, then return the existing False, f"Unexpected error: {e!s}" message so debugging info is preserved while keeping the user-facing string unchanged.

src/workflow/WorkflowManager.py (1)
32-35: Forward reference type hint needs an import or a `TYPE_CHECKING` guard.

The type hint `'QueueManager'` is a string forward reference, but Ruff flags it as undefined. Use `TYPE_CHECKING` to properly handle the conditional import.

♻️ Proposed fix

```diff
 from pathlib import Path
-from typing import Optional
+from typing import Optional, TYPE_CHECKING

 from .Logger import Logger
 # ... other imports ...

+if TYPE_CHECKING:
+    from .QueueManager import QueueManager
+
 class WorkflowManager:
     # ...
         # Initialize queue manager for online mode
-        self._queue_manager: Optional['QueueManager'] = None
+        self._queue_manager: Optional[QueueManager] = None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/WorkflowManager.py` around lines 32 - 35, The forward-reference string type hint 'QueueManager' on self._queue_manager is flagged as undefined; import typing.TYPE_CHECKING and, inside an if TYPE_CHECKING: block, import QueueManager (or its module) so the name exists for static checkers while avoiding runtime import cost, leaving the runtime annotation as Optional['QueueManager'] and keep the initialization in _init_queue_manager and the WorkflowManager class unchanged.

tests/test_parameter_presets.py (1)
380-390: Consider completing the parameter value validation.

The loop iterates over parameter values but doesn't validate them. If you intend to validate the structure completely, consider adding type checks for `param_value`. Otherwise, rename it to `_param_value` to indicate it's intentionally unused.

♻️ Option 1: Add validation for param_value

```diff
     # Tool parameters dict
     for param_name, param_value in value.items():
         assert isinstance(param_name, str)
+        # Param values should be primitives (str, int, float, bool)
+        assert isinstance(param_value, (str, int, float, bool))
```

♻️ Option 2: Rename to indicate intentionally unused

```diff
-    for param_name, param_value in value.items():
+    for param_name, _param_value in value.items():
         assert isinstance(param_name, str)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_parameter_presets.py` around lines 380 - 390, The test iterates preset_config -> (in tests/test_parameter_presets.py) over param_name, param_value but never validates param_value; either add assertions to validate allowed types (e.g., ensure param_value is one of expected primitives/containers such as str, int, float, bool, list, dict) inside the loop that iterates value.items(), or if the value is intentionally unused rename param_value to _param_value to signal that no validation is required; update the block that contains for param_name, param_value in value.items(): accordingly.

src/workflow/StreamlitUI.py (3)
384-385: Use explicit `Optional` for the nullable callable type.

Per PEP 484, implicit `Optional` is prohibited. The type hint should explicitly indicate the parameter can be `None`.

Proposed fix

```diff
-        on_change: Callable = None,
+        on_change: Callable | None = None,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/StreamlitUI.py` around lines 384 - 385, The parameter on_change in the function signature is typed as Callable = None which violates PEP 484; change the annotation to use Optional[Callable] (i.e., on_change: Optional[Callable] = None) and add the corresponding Optional import from typing (or typing.Optional) at the top of StreamlitUI.py so the nullable callable is explicitly represented; update any other occurrences of this pattern in the same module if present.
1327-1327: Unused variable `msg_type` should be prefixed with an underscore.

The `msg_type` value from the status mapping is never used. Either prefix it with an underscore to indicate it's intentionally unused, or remove it from the unpacking.

Proposed fix

```diff
-        label, msg_type = status_display.get(job_status, ("Unknown", "info"))
+        label, _msg_type = status_display.get(job_status, ("Unknown", "info"))
```

Or if `msg_type` was intended to be used for dynamic styling:

```diff
         label, msg_type = status_display.get(job_status, ("Unknown", "info"))
+        # Could use msg_type here for dynamic st.{msg_type}() calls if desired
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/StreamlitUI.py` at line 1327, The unpacking of status_display into label, msg_type yields an unused msg_type; update the assignment in StreamlitUI so the unused variable is clearly marked or removed — either unpack as label, _msg_type = status_display.get(job_status, ("Unknown", "info")) to indicate it is intentionally unused, or only extract the label via label = status_display.get(job_status, ("Unknown", "info"))[0]; locate the use near the status_display lookup in StreamlitUI and apply one of these changes (or if msg_type was meant for styling, wire it into the display logic where the label is rendered).
565-565: Use `is None` for None comparisons.

PEP 8 recommends using `is`/`is not` for singleton comparisons like `None`.

Proposed fix

```diff
-        elif (isinstance(value, str) or value == None) and options is not None:
+        elif (isinstance(value, str) or value is None) and options is not None:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/StreamlitUI.py` at line 565, In the conditional in StreamlitUI.py that currently reads "elif (isinstance(value, str) or value == None) and options is not None", replace the equality None checks with identity checks so it becomes "elif (isinstance(value, str) or value is None) and options is not None" (and optionally change "options is not None" to use "is not None" if not already) to follow PEP8; locate and update the conditional that references the variables value and options in the relevant widget/rendering function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/build-windows-executable-app.yaml:
- Around line 289-290: The second echo line overwrites the batch file created by
the first echo (so '@echo off' is lost); change the second redirection that
writes "setlocal EnableDelayedExpansion" to append (use >>) to the same file
(referencing the lines that write to ${{ env.APP_NAME }}.bat) so both directives
remain in the .bat header.
In @.streamlit/config.toml:
- Around line 11-12: Do not disable Streamlit's request-forgery protections:
revert enableCORS and enableXsrfProtection to true instead of false, and instead
of disabling protections configure corsAllowedOrigins to the external proxy URL
(set corsAllowedOrigins = ["https://your-proxy.example.com"]) and set
baseUrlPath if the app is served under a subpath; ensure
server.address/server.host remains appropriate (e.g., 0.0.0.0 for binding) while
keeping XSRF and CORS protections enabled for production.
In `@docker-compose.yml`:
- Around line 15-18: Persist Redis data by adding a named volume for Redis in
the docker-compose service that runs Redis (mount host volume to /var/lib/redis)
and declare the volume under top-level volumes so Redis state survives container
recreation; alternatively, modify that service's startup/entrypoint to remove
stale .job_id files from workspace paths on boot (detect and delete leftover
.job_id files) so recovered workflows don't reference missing Redis entries —
refer to the Redis data directory (/var/lib/redis) and the .job_id marker files
when making changes.
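The second option above (clearing stale markers on boot) could look like the following sketch. The `workspaces_root` layout and the `.job_id` filename follow the review's description; the exact directory structure is an assumption:

```python
from pathlib import Path


def clear_stale_job_markers(workspaces_root: Path) -> int:
    """Delete leftover .job_id files so workflows recovered after a container
    restart don't reference Redis job entries that no longer exist."""
    removed = 0
    # Materialize the generator before unlinking to avoid mutating while iterating.
    for marker in list(workspaces_root.rglob(".job_id")):
        try:
            marker.unlink()
            removed += 1
        except OSError:
            # Ignore races/permission issues; the worst case is one stale marker left.
            pass
    return removed
```

Running this once in the container entrypoint, before Streamlit starts, would complement (or substitute for) persisting Redis data.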
In `@gdpr_consent/src/main.ts`:
- Around line 117-126: The Matomo service object added to klaroConfig.services
(the object pushed in the if (data.args['matomo']) block) is missing a cookies
property, so Klaro cannot identify and clear Matomo cookies on consent
revocation; update the Matomo service config (the object with name: 'matomo',
purposes: ['analytics'], onAccept/onDecline callbacks) to include a cookies
array containing the appropriate patterns for Matomo cookies (e.g., regexes that
match _pk_* and _mtm* cookie names) so Klaro will remove those cookies when
analytics consent is declined or revoked.
In `@hooks/hook-analytics.py`:
- Around line 59-71: matomo_head currently injects the Matomo container script
unconditionally in matomo_head, allowing tracking before consent; change
matomo_head to defer loading by removing the immediate container <script> and
instead register a loader that only inserts the container script when consent is
signalled (e.g. when the page receives the MTMSetConsentGiven signal that
common.py pushes). Implement a small inline bootstrap in matomo_head that
defines window._mtm (so pushes can queue), exposes a loadMatomoContainer()
function, and listens for/observes the MTMSetConsentGiven event or checks a
global consent flag before programmatically creating and inserting the
container_{matomo_tag}.js script; keep identifiers matomo_head and
MTMSetConsentGiven to locate and wire the change.
In `@requirements.txt`:
- Around line 143-145: The requirements.txt currently contains floating ranges
for the Redis queue deps ("redis>=5.0.0" and "rq>=1.16.0"); change this to a
reproducible form by either (A) pinning them to exact versions (e.g., replace
"redis>=5.0.0" and "rq>=1.16.0" with specific pinned releases like
"redis==5.0.0" and "rq==1.16.0") or (B) move these declarations into your source
dependency config (e.g., pyproject.toml or requirements.in) and re-run
pip-compile so the generated requirements.txt contains fully pinned versions;
update the entries for "redis" and "rq" accordingly and recompile if you choose
option B.
- Line 80: requirements.txt pins protobuf==6.32.0 which is within the vulnerable
range; update the pin to a safe version by replacing "protobuf==6.32.0" with
"protobuf==6.33.5" (or "protobuf==5.29.6" if you must remain on 5.x), then
regenerate any lockfiles or dependency artifacts (e.g., poetry.lock, pip-compile
outputs, Docker images) and run the test suite/CI to ensure compatibility with
the new protobuf version.
In `@settings.json`:
- Around line 25-28: The new settings.json keys "queue_settings.default_timeout"
and "queue_settings.result_ttl" are never used; update QueueManager.submit_job()
to read these values from the application's configuration (or from the injected
settings object used by QueueManager) and replace the hardcoded 7200 and 86400
defaults with the configured values (falling back to 7200/86400 if the keys are
missing). Locate the submit_job method in QueueManager (and the QueueManager
constructor or initialization path) and wire in the settings value lookups for
default_timeout and result_ttl so changes in settings.json actually control job
timeout and result TTL.
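A minimal sketch of the settings lookup the prompt asks for; the nested key layout mirrors the `settings.json` keys named above, while the function name and the way settings reach `QueueManager` are assumptions:

```python
def queue_timeouts(settings: dict) -> tuple[int, int]:
    """Resolve job timeout and result TTL from settings, falling back to the
    previously hardcoded values (7200 s / 86400 s) when keys are missing."""
    qs = settings.get("queue_settings", {})
    default_timeout = int(qs.get("default_timeout", 7200))
    result_ttl = int(qs.get("result_ttl", 86400))
    return default_timeout, result_ttl
```

`submit_job()` would then pass these two values to the enqueue call instead of literals, so editing `settings.json` actually changes behavior.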
In `@src/workflow/CommandExecutor.py`:
- Around line 159-165: The run_command implementation currently returns False on
non-zero exit which allows workflows (e.g., callers like run_topp) to ignore
failures; change the behavior to raise a clear exception instead of returning
False: in run_command (the block that logs "ERROR: Command failed...") raise a
RuntimeError or a CommandExecutionError constructed with the exit code,
command[0], and the collected stderr_buffer content so failures stop execution;
make the same change in the analogous failure branch noted around the other
occurrence (the block at the other failure site referenced in the review) and
ensure any callers either catch this exception where recovery is intended or let
it propagate to abort the workflow.
- Around line 39-49: The _get_max_threads method currently reads settings from
st.session_state which fails in RQ worker processes; update
CommandExecutor._get_max_threads to avoid using Streamlit and instead read
max_threads from self.parameter_manager (via get_parameters_from_json()) and/or
environment variables (falling back to the same defaults: local=4, online=2),
determine online_deployment from the same process-local config or env var, and
return max(1, int(value)); also remove or stop referencing st/session_state in
this module so tasks.py can instantiate CommandExecutor in workers without
importing Streamlit.
In `@src/workflow/health.py`:
- Around line 65-79: The health response currently always sets
"status":"healthy" even when Worker.all() returns an empty list; update the
function that builds the dict (the code that queries Worker.all() into the
workers variable and computes queue_length = len(queue)) to set "status" to
"unhealthy" (or a non-healthy value like "degraded") when workers is empty
(i.e., len(workers) == 0), so the returned payload correctly reflects an outage
when no workers exist and the queue cannot be consumed; adjust only the status
field logic while keeping the other keys (worker_count, busy_workers,
idle_workers, queue_length, workers list) intact.
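The status logic the prompt describes can be isolated into a small builder; the field names mirror the payload keys listed above, and the worker objects are stubbed with a `state` attribute as an assumption about what `Worker.all()` returns:

```python
def build_health_payload(workers: list, queue_length: int) -> dict:
    """Report 'unhealthy' when no workers exist: an empty worker pool means
    the queue can never drain, even if Redis itself is reachable."""
    busy = sum(1 for w in workers if getattr(w, "state", "") == "busy")
    return {
        "status": "healthy" if workers else "unhealthy",
        "worker_count": len(workers),
        "busy_workers": busy,
        "idle_workers": len(workers) - busy,
        "queue_length": queue_length,
    }
```

Only the `status` field changes relative to the current behavior; the counts are computed as before.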
In `@src/workflow/ParameterManager.py`:
- Around line 42-49: The code in ParameterManager.py uses the variable tool
directly in subprocess.call (see the ini_path logic and the
subprocess.call([tool, "-write_ini", str(ini_path)]) line); add validation
before invoking the subprocess: ensure tool matches an allowed pattern or
whitelist (e.g., only alphanumerics, hyphen, underscore, period) or verify it
points to an actual executable with shutil.which(tool) and return False if
validation fails; keep using the list form of subprocess.call (no shell=True)
and fail fast (return False) on invalid tool names to prevent command injection.
In `@src/workflow/QueueManager.py`:
- Around line 218-237: The cancel_job implementation uses Job.cancel() which
only moves jobs to Redis but does not stop workers executing them; update
cancel_job so after fetching the job (Job.fetch(..., connection=self._redis))
you detect running jobs (e.g., job.get_status() == "started" or
job.is_started()) and call send_stop_job_command(self._redis, job_id) for
running jobs and keep using job.cancel() for queued jobs; import
send_stop_job_command from rq.command and ensure you still return True on
success and preserve the existing exception handling.
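The branching logic can be sketched independently of a live Redis by injecting the stop call; here `stop_running_job` stands in for `rq.command.send_stop_job_command(connection, job_id)`, and the job object is assumed to expose `get_status()`/`cancel()` as in RQ:

```python
def cancel_job_sketch(job, job_id, redis_conn, stop_running_job) -> bool:
    """Stop running jobs with an explicit stop signal to the worker; queued
    jobs only need job.cancel(). Returns False on any failure."""
    try:
        if job.get_status() == "started":
            # A started job is executing on a worker; cancel() alone won't stop it.
            stop_running_job(redis_conn, job_id)
        else:
            job.cancel()
        return True
    except Exception:
        return False
```

In `QueueManager.cancel_job` the injected callable would simply be the imported `send_stop_job_command`.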
In `@src/workflow/StreamlitUI.py`:
- Around line 549-604: In the StreamlitUI auto-widget branch, avoid mutating key
before delegating: save the unprefixed key (e.g., original_key = key) then
compute the prefixed key only where needed, and when calling _input_widget_impl
or st.checkbox from the "auto" dispatcher pass original_key so the prefix isn't
applied twice; update all recursive calls that currently pass key to use
original_key instead. Also remove the duplicate isinstance(value, bool) branch
that is unreachable because the bool case is already handled earlier. Ensure
calls reference the existing helpers _input_widget_impl and st.checkbox and that
prefixing happens exactly once.
In `@src/workflow/tasks.py`:
- Around line 104-119: The exception handler currently swallows errors and
returns {"success": False}, causing RQ to mark failed workflows as finished;
update the except block handling around workflow.execution() to perform the same
logging, _update_progress(job, ...), and shutil.rmtree(executor.pid_dir,
ignore_errors=True) cleanup but then re-raise the caught exception (raise)
instead of returning a dict so RQ will record the job as failed; keep the
existing success path that returns {"success": True, "workflow_dir":
str(workflow_path), "message": "Workflow completed successfully"} intact.
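The clean-up-then-re-raise shape described above, reduced to a sketch with the workflow call, PID directory, and logger injected (the real `tasks.py` signature is not assumed):

```python
import shutil


def run_task(workflow_execute, pid_dir, log):
    """Run the workflow; on failure, log and clean up, then re-raise so the
    queue backend records the job as FAILED instead of finished."""
    try:
        workflow_execute()
        return {"success": True}
    except Exception as e:
        log(f"Workflow failed: {e}")
        shutil.rmtree(pid_dir, ignore_errors=True)
        raise  # propagate so RQ marks the job as failed
```

The success path still returns a result dict; only the failure path changes, from returning `{"success": False}` to re-raising.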
In `@src/workflow/WorkflowManager.py`:
- Around line 114-116: The base WorkflowManager currently treats the boolean
return of execution() strictly, breaking subclasses like TagWorkflow.execution()
that return None; update the workflow process to treat a None return as success
for backward compatibility: after calling self.execution() in the method
containing the current lines (where success = self.execution() and
self.logger.log("WORKFLOW FINISHED") is guarded), coerce None to True (i.e.,
treat None as successful completion) before checking success and logging,
referencing the execution() method and the logger.log("WORKFLOW FINISHED") call
so subclasses that still return None will trigger the finish log; alternatively
document that subclasses must now return bool in execution() if you prefer
migration.
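The backward-compatible coercion is small enough to show in full; the helper name is hypothetical, but the rule matches the prompt (a legacy `None` return from `execution()` counts as success):

```python
def coerce_execution_result(result) -> bool:
    """Treat a legacy None return from execution() as success, so subclasses
    like TagWorkflow that never returned a status still log WORKFLOW FINISHED."""
    return True if result is None else bool(result)
```

The caller would apply this immediately after `success = self.execution()`, before the finished-log guard.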
---
Outside diff comments:
In `@src/workflow/CommandExecutor.py`:
- Around line 82-105: The worker thread wrapper run_and_track should ensure
every thread appends a result even if run_command raises; modify run_and_track
to catch Exception around self.run_command(cmd), log the exception (use
self.logger) and append False under the same lock so results length matches
threads, then re-use the existing semaphore and join logic; reference
run_and_track, self.run_command, results, lock, semaphore and the final
all(results) check to locate where to add the try/except and logging so a
thread-start failure cannot be misreported as success.
---
Nitpick comments:
In `@Dockerfile`:
- Around line 189-205: Backgrounded processes (redis-server and rq worker(s)
started using WORKER_COUNT / RQ_WORKER_COUNT) are not supervised so if any child
dies the container remains running under PID 1 (nginx/Streamlit) and the stack
is degraded; change the startup script to capture PIDs for redis-server and each
rq worker, install signal handlers to forward SIGTERM/SIGINT to those PIDs, and
run a supervisor loop that waits for any child to exit (e.g., using wait -n or
polling) and then exits non‑zero so the container fails fast; ensure the same
supervision is applied to any extra Streamlit instances and that nginx/Streamlit
(PID 1) is only exec'd after the monitor is in place or replace it by an exec of
a tiny supervisor that launches nginx/Streamlit and the background services.
In `@src/common/admin.py`:
- Around line 146-151: The catch-all except Exception block in the save demo
flow (the except following PermissionError and OSError) is too broad and should
either be narrowed or at minimum log the full traceback before returning the
user-friendly message; update the generic handler (the final except block) to
catch specific unexpected error types where possible or call
logging.exception("Unexpected error saving demo") (or use the module logger) to
record the traceback, then return the existing False, f"Unexpected error: {e!s}"
message so debugging info is preserved while keeping the user-facing string
unchanged.
In `@src/common/captcha_.py`:
- Line 248: The text_input call for the CAPTCHA uses a hardcoded max_chars=5
which can drift from the CAPTCHA length constant; update the c1.text_input
invocation (where capta2_text is defined) to use the existing length_captcha
constant instead of the literal 5 so the input widget stays in sync with the
CAPTCHA generation logic.
In `@src/common/common.py`:
- Around line 408-410: The if-statement in common.py uses an explicit equality
check against True (st.session_state.tracking_consent["matomo"] == True); change
this to a truthiness check (e.g., just use
st.session_state.tracking_consent["matomo"]) so the condition reads: if
(st.session_state.settings["analytics"]["matomo"]["enabled"]) and
(st.session_state.tracking_consent["matomo"]): — update the condition in the
block where this check appears (referenced by the existing if statement) to
follow PEP 8 truthiness style.
- Around line 758-764: The condition currently compares
st.session_state["spectrum_bin_peaks"] == True; change this to an identity check
so it specifically detects the boolean True value returned by st.selectbox
(i.e., use "is True") and leave the st.number_input and default assignment to
st.session_state["spectrum_num_bins"] unchanged; update the condition in the
block that uses st.selectbox("Bin Peaks", ["auto", True, False"]) and the
subsequent if controlling st.number_input("Number of Bins (m/z)", ...) so it
reads the identity check against True.
In `@src/workflow/ParameterManager.py`:
- Line 23: The __init__ signature in class ParameterManager uses an implicit
Optional by setting workflow_name: str = None; change it to an explicit Optional
type (e.g., workflow_name: Optional[str] or Union[str, None]) and add the
corresponding import from typing (Optional or Union) at the top of the file so
the constructor reads something like def __init__(self, workflow_dir: Path,
workflow_name: Optional[str] = None): and type-checkers will accept it.
- Around line 131-133: The bare except in ParameterManager (the JSON parameter
file load function) should be replaced with specific exception types; update the
except to catch json.JSONDecodeError (for invalid JSON) and
OSError/FileNotFoundError (for file I/O issues) instead of a blanket except,
ensure json is imported if not already, and keep the existing st.error message
and return {} inside that specific except block so only expected JSON/file
errors are handled while allowing SystemExit/KeyboardInterrupt to propagate.
In `@src/workflow/StreamlitUI.py`:
- Around line 384-385: The parameter on_change in the function signature is
typed as Callable = None which violates PEP 484; change the annotation to use
Optional[Callable] (i.e., on_change: Optional[Callable] = None) and add the
corresponding Optional import from typing (or typing.Optional) at the top of
StreamlitUI.py so the nullable callable is explicitly represented; update any
other occurrences of this pattern in the same module if present.
- Line 1327: The unpacking of status_display into label, msg_type yields an
unused msg_type; update the assignment in StreamlitUI so the unused variable is
clearly marked or removed — either unpack as label, _msg_type =
status_display.get(job_status, ("Unknown", "info")) to indicate it is
intentionally unused, or only extract the label via label =
status_display.get(job_status, ("Unknown", "info"))[0]; locate the use near the
status_display lookup in StreamlitUI and apply one of these changes (or if
msg_type was meant for styling, wire it into the display logic where the label
is rendered).
- Line 565: In the conditional in StreamlitUI.py that currently reads "elif
(isinstance(value, str) or value == None) and options is not None", replace the
equality None checks with identity checks so it becomes "elif (isinstance(value,
str) or value is None) and options is not None" (and optionally change "options
is not None" to use "is not None" if not already) to follow PEP8; locate and
update the conditional that references the variables value and options in the
relevant widget/rendering function.
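The reason PEP 8 mandates identity checks here is that `==` dispatches to `__eq__`, which a type can override (NumPy arrays, for instance, compare elementwise). An illustrative example, unrelated to the codebase:

```python
class Anything:
    # A pathological __eq__ that claims equality with everything, including None
    def __eq__(self, other):
        return True

v = Anything()
print(v == None)   # True: __eq__ is consulted
print(v is None)   # False: identity cannot be overridden
```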
In `@src/workflow/WorkflowManager.py`:
- Around line 32-35: The forward-reference string type hint 'QueueManager' on
self._queue_manager is flagged as undefined; import typing.TYPE_CHECKING and,
inside an if TYPE_CHECKING: block, import QueueManager (or its module) so the
name exists for static checkers while avoiding runtime import cost, leaving the
runtime annotation as Optional['QueueManager'] and keep the initialization in
_init_queue_manager and the WorkflowManager class unchanged.
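The TYPE_CHECKING pattern can be sketched as below; the import path `src.workflow.QueueManager` is an assumption for illustration:

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Evaluated only by static checkers (mypy/pyright), never at runtime,
    # so there is no import cost or circular-import risk.
    from src.workflow.QueueManager import QueueManager  # hypothetical path

class WorkflowManager:
    def __init__(self) -> None:
        # String annotation: resolved lazily, never evaluated at runtime
        self._queue_manager: Optional["QueueManager"] = None
```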
In `@tests/test_parameter_presets.py`:
- Around line 380-390: The test iterates preset_config -> (in
tests/test_parameter_presets.py) over param_name, param_value but never
validates param_value; either add assertions to validate allowed types (e.g.,
ensure param_value is one of expected primitives/containers such as str, int,
float, bool, list, dict) inside the loop that iterates value.items(), or if the
value is intentionally unused rename param_value to _param_value to signal that
no validation is required; update the block that contains for param_name,
param_value in value.items(): accordingly.
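The validation branch of that suggestion could look like the following sketch (the preset shape and `check_preset_values` name are assumptions, not the actual test code):

```python
ALLOWED_TYPES = (str, int, float, bool, list, dict)

def check_preset_values(preset_config: dict) -> None:
    # Validate every leaf parameter value instead of silently ignoring it
    for key, value in preset_config.items():
        assert isinstance(value, dict), f"{key}: expected a parameter mapping"
        for param_name, param_value in value.items():
            assert isinstance(param_value, ALLOWED_TYPES), (
                f"{key}.{param_name}: unexpected type {type(param_value).__name__}"
            )

check_preset_values({"ExampleTool": {"min_charge": 1, "tol": [10.0, 10.0]}})
```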
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4783ae94-82b7-4bab-8dcc-1a3aadafc0e6
⛔ Files ignored due to path filters (1)
gdpr_consent/dist/bundle.js is excluded by !**/dist/**
📒 Files selected for processing (27)
.github/workflows/build-windows-executable-app.yaml
.github/workflows/test-win-exe-w-embed-py.yaml
.gitignore
.streamlit/config.toml
.streamlit/secrets.toml.example
Dockerfile
docker-compose.yml
docs/REDIS_QUEUE_IMPLEMENTATION_PLAN.md
example-data/workspaces/example_demo/mzML-files/.gitkeep
example-data/workspaces/example_demo/params.json
gdpr_consent/src/main.ts
hooks/hook-analytics.py
presets.json
requirements.txt
settings.json
src/common/admin.py
src/common/captcha_.py
src/common/common.py
src/fileupload.py
src/workflow/CommandExecutor.py
src/workflow/ParameterManager.py
src/workflow/QueueManager.py
src/workflow/StreamlitUI.py
src/workflow/WorkflowManager.py
src/workflow/health.py
src/workflow/tasks.py
tests/test_parameter_presets.py
echo '@echo off' > ${{ env.APP_NAME }}.bat
echo 'setlocal EnableDelayedExpansion' > ${{ env.APP_NAME }}.bat
Line 290 overwrites the batch-file header.
The second redirection recreates ${{ env.APP_NAME }}.bat, so the @echo off written on the previous line is lost. Use >> here if you want both directives to stay in the file.
💡 Minimal fix
echo '@echo off' > ${{ env.APP_NAME }}.bat
- echo 'setlocal EnableDelayedExpansion' > ${{ env.APP_NAME }}.bat
+ echo 'setlocal EnableDelayedExpansion' >> ${{ env.APP_NAME }}.bat

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
echo '@echo off' > ${{ env.APP_NAME }}.bat
echo 'setlocal EnableDelayedExpansion' >> ${{ env.APP_NAME }}.bat
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/build-windows-executable-app.yaml around lines 289 - 290,
The second echo line overwrites the batch file created by the first echo (so
'@echo off' is lost); change the second redirection that writes "setlocal
EnableDelayedExpansion" to append (use >>) to the same file (referencing the
lines that write to ${{ env.APP_NAME }}.bat) so both directives remain in the
.bat header.
enableCORS = false
enableXsrfProtection = false
🧩 Analysis chain
🏁 Script executed:
# First, let's read the .streamlit/config.toml file to see the full context
cat -n .streamlit/config.toml

Repository: OpenMS/FLASHApp
🏁 Script executed:
# Also check if there are any comments or documentation in the file
head -30 .streamlit/config.toml
🌐 Web query:
Streamlit enableCORS enableXsrfProtection security implications reverse proxy
💡 Result:
What these flags actually do
- server.enableCORS is effectively "CORS protection" in Streamlit: when it's true, Streamlit restricts which browser origins can call the app; when it's false, Streamlit will allow cross-origin access broadly (e.g., Access-Control-Allow-Origin: * for many endpoints). Streamlit enables it by default as the "most secure posture." [1][2]
- server.enableXsrfProtection enables CSRF/XSRF protection by issuing an _xsrf token cookie and expecting the frontend to echo that token back, so the backend can tell requests are coming from the Streamlit frontend. [2]
The key security/behavior constraint (important behind proxies too)
- You can't safely mix "allow all origins" with CSRF protection. Streamlit treats enableCORS=false as incompatible with enableXsrfProtection=true, and will override CORS back to enabled/restrictive mode if XSRF protection is on. This is documented and also logged as a warning by Streamlit. [1][2]
Security implications of disabling them
- Disabling XSRF protection (enableXsrfProtection=false) removes a major defense against cross-site request forgery for any state-changing HTTP endpoints your app relies on (uploads, callbacks, component endpoints, etc.). Streamlit explicitly recommends only turning these off if you understand and accept the risk. [2]
- Disabling CORS protection (enableCORS=false) allows other websites/origins to make browser-initiated requests to your Streamlit server much more freely, increasing the risk of cross-origin abuse (especially when combined with cookies/sessions). [2]
Reverse proxy guidance (what to do instead of disabling)
- In most reverse-proxy setups, you should keep both enabled (defaults) and:
  - set server.corsAllowedOrigins to the external origin(s) your users will access (your proxy's public URL), rather than disabling CORS. [1]
  - if you're serving Streamlit under a subpath (e.g., /app1), configure server.baseUrlPath to match that path so the app works correctly behind the proxy (this is often confused with CORS/XSRF breakage). [1][5]
  - terminate TLS at the reverse proxy / load balancer (Streamlit docs recommend SSL termination there for production). [1]
Special cases to be aware of
- If you configure Streamlit's built-in authentication (st.login / OIDC via secrets.toml), Streamlit will automatically enable CORS and XSRF protection, and this overrides config.toml. [3]
- On Streamlit Community Cloud, some security-related settings (including enableXsrfProtection) may be forced/overridden by the platform. [4]
Rule of thumb: behind a reverse proxy, treat CORS/XSRF errors as a signal that your public origin/path doesn’t match what Streamlit expects—fix with corsAllowedOrigins and/or baseUrlPath rather than turning protections off. [1][2][5]
Sources
[1] Streamlit docs — config.toml options (enableCORS, corsAllowedOrigins, enableXsrfProtection, baseUrlPath, SSL termination note) (docs.streamlit.io)
[2] Streamlit forum — explanation of how enableCORS/enableXsrfProtection work and why they conflict (discuss.streamlit.io)
[3] Streamlit docs — st.login note: auth config auto-enables CORS/XSRF and overrides config.toml (docs.streamlit.io)
[4] Streamlit docs — Community Cloud overrides/limitations including enableXsrfProtection (docs.streamlit.io)
[5] Streamlit forum — reverse proxy under subdirectory fixed via server.baseUrlPath (discuss.streamlit.io)
Citations:
- 1: https://docs.streamlit.io/develop/api-reference/configuration/config.toml
- 2: https://discuss.streamlit.io/t/when-to-use-enablecors-and-enablexsrfprotection-parameters/32075
- 3: https://docs.streamlit.io/1.42.0/develop/api-reference/user/st.login
- 4: https://docs.streamlit.io/deploy/streamlit-community-cloud/status?utm_source=openai
- 5: https://discuss.streamlit.io/t/streamlit-reverse-proxy-problem-caddy-webserver/47396
Do not disable Streamlit's request-forgery protections in production deployments.
Lines 11-12 disable both CORS and XSRF protection. Combined with Line 8 exposing the server to all interfaces (0.0.0.0), this creates a cross-site request forgery vulnerability for state-changing operations—particularly concerning given the app's file upload and admin flow capabilities. The correct approach for reverse-proxy setups is to keep these protections enabled and configure corsAllowedOrigins to your proxy's external URL and baseUrlPath if serving under a subpath, rather than disabling the protections entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.streamlit/config.toml around lines 11 - 12, Do not disable Streamlit's
request-forgery protections: revert enableCORS and enableXsrfProtection to true
instead of false, and instead of disabling protections configure
corsAllowedOrigins to the external proxy URL (set corsAllowedOrigins =
["https://your-proxy.example.com"]) and set baseUrlPath if the app is served
under a subpath; ensure server.address/server.host remains appropriate (e.g.,
0.0.0.0 for binding) while keeping XSRF and CORS protections enabled for
production.
environment:
  # Number of Streamlit server instances (default: 1 = no load balancer).
  # Set to >1 to enable nginx load balancing across multiple Streamlit instances.
  - STREAMLIT_SERVER_COUNT=1
Persist Redis alongside workspaces if job recovery is expected.
This service now runs Redis locally, but the compose setup still only persists the workspace volume. Recreating the container wipes /var/lib/redis while leaving .job_id files behind, so recovered workflows can point to jobs that no longer exist. Add a Redis data volume or clear stale job IDs on startup.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docker-compose.yml` around lines 15 - 18, Persist Redis data by adding a
named volume for Redis in the docker-compose service that runs Redis (mount host
volume to /var/lib/redis) and declare the volume under top-level volumes so
Redis state survives container recreation; alternatively, modify that service's
startup/entrypoint to remove stale .job_id files from workspace paths on boot
(detect and delete leftover .job_id files) so recovered workflows don't
reference missing Redis entries — refer to the Redis data directory
(/var/lib/redis) and the .job_id marker files when making changes.
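The startup-cleanup alternative from that prompt can be sketched as below; the workspace layout and the `.job_id` marker filename are taken from the review text, while the function name and root path are hypothetical:

```python
from pathlib import Path

def clear_stale_job_ids(workspaces_root: Path) -> int:
    """Remove leftover .job_id markers so recovered workflows don't
    reference Redis jobs wiped by container recreation."""
    removed = 0
    for marker in workspaces_root.rglob(".job_id"):
        marker.unlink(missing_ok=True)
        removed += 1
    return removed
```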
if (data.args['matomo']) {
    klaroConfig.services.push(
        {
            name: 'matomo',
            purposes: ['analytics'],
            onAccept: callback,
            onDecline: callback,
        }
    )
}
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the file and the specific lines mentioned
head -n 130 gdpr_consent/src/main.ts | tail -n 40
🏁 Script executed:
# Let's also see the Google Analytics service definition around lines 98-100
head -n 110 gdpr_consent/src/main.ts | tail -n 20
🏁 Script executed:
# Let's check if there are any other Matomo-related configurations in the codebase
rg -i "matomo" --type ts --type js -B 3 -A 3
🏁 Script executed:
# Check for Klaro configuration patterns to understand the expected structure
rg "cookies:" gdpr_consent/src/main.ts -B 2 -A 2
🌐 Web query:
Klaro cookie clearing behavior onDecline cookies configuration
💡 Result:
Klaro doesn’t have a special onDecline hook name in its config; instead, decline/withdrawal is handled by the regular callback(consent, service), where consent === false means the user declined (or later disabled) that service. (klaro.org)
How Klaro clears cookies when a user declines
To have Klaro auto-delete cookies on decline/withdraw, you must list the cookies for each service in that service’s cookies config. When consent changes from true → false, Klaro can delete the matching cookies. (thinkinsights.net)
Example (service-level cookie deletion + decline handling):
services: [{
  name: 'google-analytics',
  purposes: ['analytics'],
  // cookies Klaro should remove when user declines/withdraws consent
  cookies: ['_ga', '_gid', '_gat', /^_gac_.*$/i],
  // called on accept AND decline; decline == consent === false
  callback: function (consent, service) {
    if (!consent) {
      // optional extra cleanup you must do yourself (e.g., vendor APIs)
    }
  }
}]

Cookie entries can be names or regexes. (thinkinsights.net)
Important limitations (common “why didn’t it clear?” causes)
- If you don’t provide the right path/domain, Klaro may be unable to delete the cookie (JS can’t read a cookie’s path/domain, so you must specify them when they’re not default). (klaro.org)
- Klaro cannot delete cookies set on a third-party domain, or cookies marked HttpOnly. (klaro.org)
Example with path + domain tuple:
cookies: [
  [/^_pk_.*$/, '/', 'example.com']
]

Citations:
- 1: https://klaro.org/docs/integration/annotated-configuration
- 2: https://thinkinsights.net/libraries/klaro/dist/
- 3: https://thinkinsights.net/libraries/klaro/dist/
- 4: https://klaro.org/docs/integration/annotated-configuration
- 5: https://klaro.org/docs/integration/annotated-configuration
- 6: https://klaro.org/docs/integration/annotated-configuration
🌐 Web query:
Klaro.js service configuration cookies property documentation
💡 Result:
In Klaro.js, each service can define a cookies property to tell Klaro which cookies belong to that service so Klaro can delete them when the user disables/revokes consent.
cookies format (per-service)
cookies is an array. Each entry can be either:
- A cookie name (string), e.g. "piwik_ignore"
- A cookie name pattern (RegExp), e.g. /^_pk_.*$/
- A triple [nameOrRegex, path, domain], e.g. [/^_pk_.*$/, '/', 'example.com']
Klaro’s docs note that providing path and domain is needed when cookies are set on a non-/ path or different domain, because otherwise Klaro can’t reliably delete them in JavaScript; it also can’t delete third-party-domain cookies or HTTPOnly cookies. [1]
Official documentation
The explanation and examples are in the “Integration → Annotated configuration” page (see the services[].cookies section). [1]
Sources:
[1] Klaro! Documentation — Integration: Annotated configuration (klaro.org)
Add cookies property to Matomo service configuration so consent revocation actually clears cookies.
The Matomo service lacks the cookies property that Klaro needs to identify and remove cookies on consent revocation. Without it, cookies like _pk_* and _mtm* will persist even after a user declines or revokes analytics consent, leaving a persistent analytics record despite opt-out. Google Analytics correctly includes cookies: [/^_ga(_.*)?/] for this reason.
Proposed change
  if (data.args['matomo']) {
      klaroConfig.services.push(
          {
              name: 'matomo',
+             cookies: [
+                 /^_pk_.*/,
+                 /^_mtm.*/
+             ],
              purposes: ['analytics'],
              onAccept: callback,
              onDecline: callback,
          }
      )
  }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gdpr_consent/src/main.ts` around lines 117 - 126, The Matomo service object
added to klaroConfig.services (the object pushed in the if (data.args['matomo'])
block) is missing a cookies property, so Klaro cannot identify and clear Matomo
cookies on consent revocation; update the Matomo service config (the object with
name: 'matomo', purposes: ['analytics'], onAccept/onDecline callbacks) to
include a cookies array containing the appropriate patterns for Matomo cookies
(e.g., regexes that match _pk_* and _mtm* cookie names) so Klaro will remove
those cookies when analytics consent is declined or revoked.
def matomo_head(matomo_url, matomo_tag):
    return f"""
    <!-- Matomo Tag Manager -->
    <script>
    var _mtm = window._mtm = window._mtm || [];
    _mtm.push({{'mtm.startTime': (new Date().getTime()), 'event': 'mtm.Start'}});
    (function() {{
        var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
        g.async=true; g.src='{matomo_url}/container_{matomo_tag}.js'; s.parentNode.insertBefore(g,s);
    }})();
    </script>
    <!-- End Matomo Tag Manager -->
    """
🧩 Analysis chain
🏁 Script executed:
# First, let's find and examine the hook-analytics.py file
find . -name "hook-analytics.py" -type f
🏁 Script executed:
# Let's examine the hook-analytics.py file, focusing on lines 59-101
cat -n ./hooks/hook-analytics.py | sed -n '50,110p'
🏁 Script executed:
# Let's look at the broader context - examine the full file to see how patch_head works
cat -n ./hooks/hook-analytics.py | head -60
🏁 Script executed:
# Search for gdpr_consent and Klaro references
find . -type d -name "gdpr_consent" 2>/dev/null
find . -type f -name "*.ts" -o -name "*.js" | head -20
rg -l "klaro|consent" --type js --type ts | head -15
🏁 Script executed:
# Examine the consent management code
cat -n ./gdpr_consent/src/main.ts
🏁 Script executed:
# Let's check if there's any client-side gating of Matomo tracking
# Search for matomo references and how it's managed at runtime
rg -i "matomo" --type py -A 3 -B 3 | head -100
🏁 Script executed:
# Let's examine the common.py file to see the full context of how matomo tracking is gated
cat -n ./src/common/common.py | sed -n '400,420p'
🏁 Script executed:
# Let's look at the full captcha_.py to understand the complete consent flow
cat -n ./src/common/captcha_.py | sed -n '200,240p'
🏁 Script executed:
# Let's understand the complete flow - check if Google Analytics has consent checks
cat -n ./hooks/hook-analytics.py | sed -n '11,34p'
Matomo Tag Manager loads before consent is obtained.
The Matomo container script (lines 59-71, injected at lines 98-101) is loaded unconditionally at page bootstrap based solely on configuration, without any consent safeguards. In contrast, Google Analytics (lines 18-25) sets default consent to 'denied' to prevent tracking before the user confirms their choice.
Matomo's consent flow relies on Klaro (rendered at runtime during the CAPTCHA step) asking the user for approval, and only afterwards does common.py (lines 408-410) signal consent by pushing MTMSetConsentGiven to the container. This means the Matomo Tag Manager container executes and can begin collecting data before the user has had any opportunity to decline, creating a GDPR compliance gap.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@hooks/hook-analytics.py` around lines 59 - 71, matomo_head currently injects
the Matomo container script unconditionally in matomo_head, allowing tracking
before consent; change matomo_head to defer loading by removing the immediate
container <script> and instead register a loader that only inserts the container
script when consent is signalled (e.g. when the page receives the
MTMSetConsentGiven signal that common.py pushes). Implement a small inline
bootstrap in matomo_head that defines window._mtm (so pushes can queue), exposes
a loadMatomoContainer() function, and listens for/observes the
MTMSetConsentGiven event or checks a global consent flag before programmatically
creating and inserting the container_{matomo_tag}.js script; keep identifiers
matomo_head and MTMSetConsentGiven to locate and wire the change.
ini_path = Path(self.ini_dir, tool + ".ini")
if ini_path.exists():
    return True
try:
    subprocess.call([tool, "-write_ini", str(ini_path)])
except FileNotFoundError:
    return False
return ini_path.exists()
Validate tool name before subprocess execution.
The tool parameter is used directly in a subprocess call. While tool names typically come from controlled sources (TOPP tool definitions), adding validation provides defense-in-depth against potential command injection if tool names are ever derived from user input.
🛡️ Proposed fix to validate tool name
+ import re
+
  def create_ini(self, tool: str) -> bool:
      """
      Create an ini file for a TOPP tool if it doesn't exist.

      Args:
          tool: Name of the TOPP tool (e.g., "CometAdapter")

      Returns:
          True if ini file exists (created or already existed), False if creation failed
      """
+     # Validate tool name contains only safe characters (alphanumeric and common delimiters)
+     if not re.match(r'^[A-Za-z0-9_-]+$', tool):
+         return False
+
      ini_path = Path(self.ini_dir, tool + ".ini")
      if ini_path.exists():
          return True
      try:
          subprocess.call([tool, "-write_ini", str(ini_path)])
      except FileNotFoundError:
          return False
      return ini_path.exists()

🧰 Tools
🪛 Ruff (0.15.4)
[error] 46-46: subprocess call: check for execution of untrusted input
(S603)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/workflow/ParameterManager.py` around lines 42 - 49, The code in
ParameterManager.py uses the variable tool directly in subprocess.call (see the
ini_path logic and the subprocess.call([tool, "-write_ini", str(ini_path)])
line); add validation before invoking the subprocess: ensure tool matches an
allowed pattern or whitelist (e.g., only alphanumerics, hyphen, underscore,
period) or verify it points to an actual executable with shutil.which(tool) and
return False if validation fails; keep using the list form of subprocess.call
(no shell=True) and fail fast (return False) on invalid tool names to prevent
command injection.
def cancel_job(self, job_id: str) -> bool:
    """
    Cancel a queued or running job.

    Args:
        job_id: The job ID to cancel

    Returns:
        True if successfully canceled
    """
    if not self.is_available:
        return False

    try:
        from rq.job import Job

        job = Job.fetch(job_id, connection=self._redis)
        job.cancel()
        return True
    except Exception:
🧩 Analysis chain
🌐 Web query:
RQ Job.cancel() running job started state behavior documentation
💡 Result:
In python-rq (RQ), Job.cancel() is not a “kill a running job” API.
What Job.cancel() actually does
Job.cancel():
- sets the job's status to CANCELED
- removes it from relevant registries / queue bookkeeping (and then adds it to CanceledJobRegistry) so it won't be executed later (or will stop being tracked as queued/deferred/etc.)
- optionally enqueues dependents if enqueue_dependents=True (it calls Queue.enqueue_dependents(...)) [1]
This is essentially state/registry management in Redis, not process termination.
If the job is already “started/running”
Even if the job is already in started state, cancel() still only updates registries (it removes executions from StartedJobRegistry, etc.)—it does not forcibly stop the worker’s current execution at the OS/process level [1].
How to stop a currently executing job
To actually stop a running job, RQ provides worker commands:
- send_stop_job_command(redis, job_id): tells the worker to immediately stop a currently executing job; the stopped job is sent to FailedJobRegistry [2]
- send_kill_horse_command(redis, worker_name): cancels the worker's currently executing job by killing the "horse" (ignored if the worker isn't busy) [2]
Sources: rq.job.Job.cancel implementation [1], RQ “Workers → Sending Commands to Worker” documentation (stop job / kill horse) [2].
[1] https://sources.debian.org/src/python-rq/2.3.3-2/rq/job.py
[2] https://testrqtest.readthedocs.io/en/latest/docs/workers/
Use correct RQ API for stopping running jobs.
The docstring promises to cancel "queued or running jobs," but Job.cancel() only updates Redis state—it does not stop workers that are already executing the job. For queued jobs, Job.cancel() works as documented. For running jobs, the worker continues executing while the job is moved to CanceledJobRegistry, creating a gap between the documented behavior and actual behavior. To stop a running job, use send_stop_job_command(redis, job_id) instead, which tells the worker to immediately stop execution.
🧰 Tools
🪛 Ruff (0.15.4)
[warning] 236-236: Consider moving this statement to an else block
(TRY300)
[warning] 237-237: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/workflow/QueueManager.py` around lines 218 - 237, The cancel_job
implementation uses Job.cancel() which only moves jobs to Redis but does not
stop workers executing them; update cancel_job so after fetching the job
(Job.fetch(..., connection=self._redis)) you detect running jobs (e.g.,
job.get_status() == "started" or job.is_started()) and call
send_stop_job_command(self._redis, job_id) for running jobs and keep using
job.cancel() for queued jobs; import send_stop_job_command from rq.command and
ensure you still return True on success and preserve the existing exception
handling.
if isinstance(value, bool):
    st.checkbox(name, value=value, key=key, help=help, on_change=on_change)
elif isinstance(value, (int, float)):
    self._input_widget_impl(
        key,
        value,
        name=name,
        help=help,
        widget_type="number",
        options=None,
        min_value=min_value,
        max_value=max_value,
        step_size=step_size,
        display_file_path=False,
        on_change=on_change,
    )
elif (isinstance(value, str) or value == None) and options is not None:
    self._input_widget_impl(
        key,
        value,
        name=name,
        help=help,
        widget_type="selectbox",
        options=options,
        min_value=None,
        max_value=None,
        step_size=1,
        display_file_path=False,
        on_change=on_change,
    )
elif isinstance(value, list) and options is not None:
    self._input_widget_impl(
        key,
        value,
        name=name,
        help=help,
        widget_type="multiselect",
        options=options,
        min_value=None,
        max_value=None,
        step_size=1,
        display_file_path=False,
        on_change=on_change,
    )
elif isinstance(value, bool):
    self._input_widget_impl(
        key, value, name=name, help=help, widget_type="checkbox",
        options=None, min_value=None, max_value=None, step_size=1,
        display_file_path=False, on_change=on_change
    )
else:
    self._input_widget_impl(
        key, value, name=name, help=help, widget_type="text",
        options=None, min_value=None, max_value=None, step_size=1,
        display_file_path=False, on_change=on_change
    )
Recursive calls double-prefix the key; duplicate bool branch is unreachable.
Two issues in the "auto" widget type handling:
1. Double-prefixing bug: At line 461, key is reassigned with the prefix. The recursive calls at lines 552, 566, 580, 594, 600 pass this already-prefixed key. When the recursive call executes line 461 again, the prefix is added a second time (e.g., "CUSTOM:CUSTOM:my_param"), causing incorrect session state keys.
2. Unreachable code: Lines 593-598 handle bool type, but isinstance(value, bool) is already checked and handled at lines 549-550. This branch can never execute.
Proposed fix
+ # Store original key before prefixing for recursive calls
+ original_key = key
key = f"{self.parameter_manager.param_prefix}{key}"
  if widget_type == "text":

Then update all recursive calls to use original_key:
elif isinstance(value, (int, float)):
self._input_widget_impl(
- key,
+ original_key,
value,
      name=name,

And remove the duplicate bool handling at lines 593-598.
🧰 Tools
🪛 Ruff (0.15.4)
[error] 565-565: Comparison to None should be cond is None
Replace with cond is None
(E711)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/workflow/StreamlitUI.py` around lines 549 - 604, In the StreamlitUI
auto-widget branch, avoid mutating key before delegating: save the unprefixed
key (e.g., original_key = key) then compute the prefixed key only where needed,
and when calling _input_widget_impl or st.checkbox from the "auto" dispatcher
pass original_key so the prefix isn't applied twice; update all recursive calls
that currently pass key to use original_key instead. Also remove the duplicate
isinstance(value, bool) branch that is unreachable because the bool case is
already handled earlier. Ensure calls reference the existing helpers
_input_widget_impl and st.checkbox and that prefixing happens exactly once.
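The double-prefixing failure mode can be demonstrated in isolation; the helper names and recursion depth below are hypothetical, standing in for the dispatcher's recursive calls:

```python
PREFIX = "CUSTOM:"

def widget_key_buggy(key: str, depth: int) -> str:
    # Mirrors the reported bug: the prefix is re-applied on every
    # recursive dispatch because `key` was mutated before delegating.
    key = f"{PREFIX}{key}"
    return key if depth == 0 else widget_key_buggy(key, depth - 1)

def widget_key_fixed(key: str, depth: int) -> str:
    # Fix: keep the unprefixed key for recursion; prefix exactly once.
    original_key = key
    if depth == 0:
        return f"{PREFIX}{original_key}"
    return widget_key_fixed(original_key, depth - 1)

print(widget_key_buggy("my_param", 1))  # CUSTOM:CUSTOM:my_param
print(widget_key_fixed("my_param", 1))  # CUSTOM:my_param
```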
# Execute the workflow
workflow.execution()

# Log workflow completion
logger.log("WORKFLOW FINISHED")

_update_progress(job, 1.0, "Workflow completed")

# Clean up pid directory (in case it was created by accident)
shutil.rmtree(executor.pid_dir, ignore_errors=True)

return {
    "success": True,
    "workflow_dir": str(workflow_path),
    "message": "Workflow completed successfully"
}
🧩 Analysis chain
🌐 Web query:
RQ Redis Queue job function exception handling status finished vs failed
💡 Result:
In Python RQ (Redis Queue), the difference between a job being marked finished vs failed comes down to what the worker sees at the end of execution:
- finished: your job function returns normally (even if you caught exceptions internally). If you try/except an error and don't re-raise, RQ considers the job successful and will put it in the FinishedJobRegistry. (python-rq.org)
- failed: an exception escapes your job function (or the job is otherwise considered unsuccessful). RQ records exception info in job.exc_info "if job doesn't finish successfully" and puts the job in the FailedJobRegistry. (python-rq.org)
What about “exception handlers”?
RQ “custom exception handlers” are invoked when an exception happens; they’re for adding side effects (log to DB, notify, etc.), not for changing the definition of success vs failure. Handler return values only control whether RQ continues to the next handler in the chain (False stops, True continues; None counts as True). (python-rq.org)
Practical rule
- If you want the job to be failed (so monitoring/retries/failed registry apply), let the exception propagate (or re-raise after logging).
- If you want the job to be finished, handle the error and return normally (optionally storing your own "soft-failure" state in job.meta). (python-rq.org)
Citations:
- 1: https://python-rq.org/docs/workers/
- 2: https://python-rq.org/docs/jobs/
- 3: https://python-rq.org/docs/exceptions/
- 4: https://python-rq.org/docs/jobs/
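The practical rule above can be demonstrated without RQ itself: a task function that swallows its exception returns normally (RQ would mark the job finished), while one that lets the exception escape would land in the FailedJobRegistry. The function names below are illustrative, not from the PR.

```python
def task_soft_failure(data):
    """Catches errors internally: RQ would mark this job *finished*."""
    try:
        return {"success": 1 / data}
    except ZeroDivisionError as exc:
        # Swallowed: the function returns normally, so RQ sees success.
        return {"success": False, "error": str(exc)}


def task_hard_failure(data):
    """Lets errors escape: RQ would mark this job *failed*."""
    return {"success": 1 / data}  # ZeroDivisionError propagates to the worker
```

Calling `task_soft_failure(0)` returns an error dict instead of raising, whereas `task_hard_failure(0)` raises and would put the job in the failed registry.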
🏁 Script executed:
cd src && find . -name "tasks.py" -path "*/workflow/*" -exec wc -l {} \;
Repository: OpenMS/FLASHApp
Length of output: 82
🏁 Script executed:
cd src/workflow && sed -n '104,150p' tasks.py
Repository: OpenMS/FLASHApp
Length of output: 1456
Re-raise exceptions after cleanup to preserve RQ failed status.
Both the success path (lines 104-119) and exception handler (lines 121-148) return normally, which causes RQ to mark all jobs as finished regardless of success or failure. This breaks integration with QueueManager.get_job_info() and prevents failed workflows from appearing in RQ's FailedJobRegistry.
After logging and cleanup, re-raise the exception instead of returning {"success": False} to ensure failed workflows have the correct RQ status.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/workflow/tasks.py` around lines 104 - 119, The exception handler
currently swallows errors and returns {"success": False}, causing RQ to mark
failed workflows as finished; update the except block handling around
workflow.execution() to perform the same logging, _update_progress(job, ...),
and shutil.rmtree(executor.pid_dir, ignore_errors=True) cleanup but then
re-raise the caught exception (raise) instead of returning a dict so RQ will
record the job as failed; keep the existing success path that returns
{"success": True, "workflow_dir": str(workflow_path), "message": "Workflow
completed successfully"} intact.
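A minimal sketch of the suggested shape — the same logging and cleanup run on both paths, but the exception is re-raised so RQ records the job as failed. `update_progress` and the `executor.pid_dir` objects below are stand-ins for the helpers in tasks.py, not their real signatures.

```python
import shutil


def run_workflow_task(job, workflow, executor, update_progress):
    """Cleanup always runs, but errors still propagate to the RQ worker."""
    try:
        workflow.execution()
        update_progress(job, 1.0, "Workflow completed")
        return {"success": True, "message": "Workflow completed successfully"}
    except Exception:
        update_progress(job, 1.0, "Workflow failed")
        raise  # re-raise: RQ moves the job to the FailedJobRegistry
    finally:
        # Remove the pid directory on success and failure alike.
        shutil.rmtree(executor.pid_dir, ignore_errors=True)
```

Because the `raise` sits inside the except block, `job.exc_info` is populated and monitoring/retries behave as expected, while the `finally` block guarantees the pid directory is cleaned up either way.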
success = self.execution()
if success:
    self.logger.log("WORKFLOW FINISHED")
Subclass compatibility: execution() return type changed from None to bool.
The base execution() now returns bool, and workflow_process() checks this to log "WORKFLOW FINISHED". However, existing subclasses like TagWorkflow.execution() (context snippet 3, line 83) return None implicitly, which evaluates to False and prevents the finish log.
Either update all subclasses to return True on success, or handle None as success for backward compatibility:
🔧 Option 1: Handle None as success (backward compatible)
  success = self.execution()
- if success:
+ if success is not False:  # Treat None and True as success
      self.logger.log("WORKFLOW FINISHED")

🔧 Option 2: Update docstring to clarify migration requirement
  def execution(self) -> bool:
      """
      Add your workflow steps here.
-     Returns True on success, False on error.
+     Returns True on success, False on error.
+
+     Note: Subclasses should explicitly return True on success.
+     Returning None (implicit) will not log "WORKFLOW FINISHED".
      """

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/workflow/WorkflowManager.py` around lines 114 - 116, The base
WorkflowManager currently treats the boolean return of execution() strictly,
breaking subclasses like TagWorkflow.execution() that return None; update the
workflow process to treat a None return as success for backward compatibility:
after calling self.execution() in the method containing the current lines (where
success = self.execution() and self.logger.log("WORKFLOW FINISHED") is guarded),
coerce None to True (i.e., treat None as successful completion) before checking
success and logging, referencing the execution() method and the
logger.log("WORKFLOW FINISHED") call so subclasses that still return None will
trigger the finish log; alternatively document that subclasses must now return
bool in execution() if you prefer migration.
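The backward-compatible option reduces to a one-line coercion: only an explicit False counts as failure, so legacy subclasses that fall off the end of execution() (implicitly returning None) still log completion. A minimal sketch, separate from the real WorkflowManager:

```python
def finished_successfully(result):
    """Only an explicit False is failure; None (legacy) and True succeed."""
    return result is not False
```

With this helper, `if finished_successfully(self.execution()):` replaces the strict truthiness check, and subclasses migrated to return bool coexist with older ones that return nothing.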
Summary
This PR introduces a Redis-based job queueing system for online Docker deployments while maintaining full backward compatibility with the existing multiprocessing approach for local/offline mode. Additionally, it adds support for parameter presets and improves the UI with better widget reactivity controls.
Key Changes
Redis Queue System (Online Mode Only)
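The graceful fallback described in this section can be sketched with a cheap availability probe: check whether the Redis port accepts a TCP connection, and run the task in-process (standing in for the multiprocessing path) when it does not. This is an illustrative sketch, not the PR's actual QueueManager implementation.

```python
import socket


def redis_reachable(host="localhost", port=6379, timeout=0.5):
    """TCP probe; avoids a hard dependency on the redis package."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def enqueue(task, *args, host="localhost", port=6379):
    """Dispatch to Redis/RQ when reachable, else fall back to local execution."""
    if redis_reachable(host, port):
        # The real code would call rq.Queue(connection=...).enqueue(task, *args)
        return ("redis", None)
    return ("local", task(*args))
```

Calling `enqueue(lambda x: x * 2, 21, port=...)` against an unused port falls back to local execution and returns the task's result directly.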
- QueueManager module (src/workflow/QueueManager.py): Manages Redis queue operations with graceful fallback to multiprocessing when Redis is unavailable
- tasks.py module (src/workflow/tasks.py): Worker task definitions for RQ execution without Streamlit dependencies
- WorkflowManager: Added conditional logic to use Redis queue in online mode while preserving existing multiprocessing for local deployments
- Added rq and redis to requirements.txt

Parameter Presets System
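The preset mechanism summarized under this heading amounts to layering named parameter overrides from presets.json over the workflow defaults. The JSON schema and helper below are assumptions for illustration, not the PR's actual format:

```python
import json


def apply_preset(defaults, presets_json, name):
    """Return defaults with the named preset's parameters merged on top."""
    preset = json.loads(presets_json)[name]["params"]
    merged = dict(defaults)
    merged.update(preset)
    return merged


# Hypothetical presets.json content with a description per preset
PRESETS = json.dumps({
    "high-resolution": {
        "description": "Tighter tolerances for high-res instruments",
        "params": {"mass_tolerance": 5, "use_decoys": True},
    }
})

defaults = {"mass_tolerance": 10, "use_decoys": False, "threads": 4}
print(apply_preset(defaults, PRESETS, "high-resolution"))
# → {'mass_tolerance': 5, 'use_decoys': True, 'threads': 4}
```

Parameters the preset does not mention (here `threads`) keep their workflow defaults, which is what makes presets composable with per-workflow configuration.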
- ParameterManager: Added preset loading and application functionality with workflow-specific preset support
- presets.json: Configuration file for workflow parameter presets with descriptions
- Tests (tests/test_parameter_presets.py) for preset functionality

UI Improvements
- Extended StreamlitUI to support both reactive and isolated widget rendering via the @st.fragment decorator
- Added a reactive parameter to the input_widget() and select_input_file() methods

Admin & Demo Workspace Features
- admin.py module (src/common/admin.py): Admin utilities for password verification and demo workspace management
- common.py: Added functions for safe workspace name validation, demo workspace discovery, and symlink-based workspace setup
- example_demo workspace with sample parameters

Configuration & Deployment
- .streamlit/config.toml with CORS and XSRF settings
- .streamlit/secrets.toml.example for admin password configuration

Health Monitoring
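Worker health for sidebar metrics can be derived from heartbeat ages: a worker whose last heartbeat is older than a threshold is reported as stale. The data shape and threshold below are assumptions for illustration, not the actual health.py API:

```python
import time


def worker_health(heartbeats, max_age=30.0, now=None):
    """Map worker name -> 'healthy'/'stale' from unix heartbeat timestamps."""
    now = time.time() if now is None else now
    return {
        name: "healthy" if now - ts <= max_age else "stale"
        for name, ts in heartbeats.items()
    }


print(worker_health({"worker-1": 1000.0, "worker-2": 960.0}, now=1010.0))
# → {'worker-1': 'healthy', 'worker-2': 'stale'}
```

Passing `now` explicitly keeps the function deterministic for tests; the sidebar would call it with the default (current time).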
- health.py module (src/workflow/health.py): Redis and worker health check utilities for sidebar metrics

Implementation Details
- online_deployment setting and Redis availability

Testing
https://claude.ai/code/session_011JfuSn9A9sfCB1egKiyjJR
Summary by CodeRabbit
Release Notes
New Features
Improvements