Architecture: Cluster Multi Process

THIS SECTION REFERS TO AN EXPERIMENTAL FORK.


Multi-Process Model

Overview

This Node-RED fork uses the native Node.js cluster module for multi-processing.
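
The process split follows the standard Node.js cluster pattern. The sketch below only illustrates that pattern as described here (master reads settings, each worker runs the full runtime); the file names and the `workers` setting are assumptions, not the fork's actual startup code.

```js
const cluster = require('cluster');

if (cluster.isMaster) {
    // Master: read the settings file and fork at least one worker
    const settings = require('./settings.js');
    const workers = settings.workers || 1;   // 'workers' is a hypothetical setting name
    for (let i = 0; i < workers; i++) {
        cluster.fork();
    }
} else {
    // Worker: each worker runs the full Node-RED runtime,
    // including the admin interface
    require('./red.js');
}
```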

Design Notes

  1. A master process is spun up that reads the settings file and initializes at least one worker process
  2. Each worker process runs the full Node-RED runtime, including the admin interface
  3. As per the base Node-RED configuration, http / ws nodes use the admin server
  4. The master process assigns a 'bingo' worker that is responsible for handling input / event nodes that do not have inputs. This includes nodes with persistent connections to external resources (MQTT, Twitter, etc.).
  5. Every message sent from a node goes through a message broker invoked in the Node.send method
  6. If there is only one worker process, all message passing works as usual
  7. Otherwise, the broker evaluates the sender and target nodes and routes the message accordingly. Every node can have a .cluster property that tells Node-RED how to handle it; if it is missing, Node-RED makes an assumption based on the worker type and the number of inputs into the node (see the routing sketch after this list)
  8. When a new flow is created and deployed, the receiving worker emits a message to all workers to reload the flows from storage
  9. The new context system enables adding different persistence sources for contexts. The default is an in-memory datastore in the master process, which is shared by all worker processes
  10. If a context is initialized with the 'shared' parameter set to true, the .get method on the returned object accepts an optional callback, allowing for async storage (a usage sketch also follows this list).
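
As a rough sketch of notes 5–7 only: the broker below checks a node's .cluster property, falls back to a guess based on the node's input count, and either delivers locally or forwards across processes. Every name here (routeMessage, deliverLocally, forwardToWorker, the 'bingo' policy value) is hypothetical and not the fork's actual API.

```js
// Hypothetical broker invoked from Node.send (note 5). The helpers and
// policy values below are illustrative stand-ins, not the fork's real code.
const numWorkers = 4;        // assumed to come from the settings file
const bingoWorkerId = 1;     // worker designated for input/event nodes (note 4)

function deliverLocally(targetNode, msg) {
    targetNode.receive(msg);               // normal in-process delivery
}

function forwardToWorker(workerId, nodeId, msg) {
    // Relay a serialized JSON message via the master process
    process.send({ type: 'node-msg', workerId, nodeId, msg });
}

function routeMessage(targetNode, msg) {
    if (numWorkers === 1) {
        return deliverLocally(targetNode, msg);   // note 6: single worker, as usual
    }
    // Note 7: honour an explicit .cluster property, otherwise guess from
    // the worker type and the node's input count
    const policy = targetNode.cluster ||
        (targetNode.inputs === 0 ? 'bingo' : 'local');
    if (policy === 'bingo') {
        return forwardToWorker(bingoWorkerId, targetNode.id, msg);
    }
    return deliverLocally(targetNode, msg);
}
```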
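
Note 10 describes an optional callback on .get; the snippet below guesses at how that might look from inside a function node (where flow and node are provided by the sandbox). Only the behaviour is taken from the notes above; the key name and surrounding code are assumptions.

```js
// Hypothetical usage inside a function node; 'flow' and 'node' are the
// objects the function-node sandbox normally provides.

// Synchronous read against the default in-memory store in the master process
const current = flow.get('counter') || 0;

// Optional callback form (note 10), allowing an async backing store
flow.get('counter', function (err, value) {
    if (err) { node.error(err); return; }
    node.send({ payload: value });
});
```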

Files

| File | Description |
|------|-------------|
| /cluster/index.js | In charge of general process management. Returns a 'clusterRED' object. Has methods to initialize Master and Worker processes, spawn and kill processes, and provide status on child processes. |
| /cluster/master.js | Initialization code for the master process. Spawns workers, attaches events to the master cluster process. |
| /cluster/masterEvents.js | Attaches events to the worker process objects in the master process. Handles messages that are emitted by worker processes. |
| /cluster/worker.js | Initialization code for the worker process. |
| /cluster/workerEvents.js | Attaches events to the worker processes. Handles messages emitted from the master process. |
| red.js | Controls instantiation of the admin server, API, core admin UI, and static server. |
| red/runtime/events.js | Modifies the runtime event bus. Sends messages from Worker to Master, which are then broadcast to all other Workers. Messages from the Master are marked so as not to cause a rebroadcast cascade. |
| red/api/comms.js | Enables the websocket comms bus if on the Master process. |
| red/api/index.js | Enables the API if on the Master process. |
| red/red.js | Initializes the runtime, launches the adminApp and nodeApp, and assigns the server property to the adminApp server if the process is the Master. |
| red/runtime/index.js | Enables logging for the Master process; sends messages that go across the runtime comms bus to the Master process for distribution. |
| red/runtime/nodes/Node.js | Intercepts messages from nodes that are running on the Master and redirects them to a Worker on a round-robin basis. |
| red/runtime/nodes/flows/index.js | Restarts Workers after flow modification. This brute-force method ensures no lingering issues with long-running processes, bound ports, etc., but does open up the possibility of broken pipes when writing to files / sending data / etc. |
| red/runtime/storage/localfilesystem.js | Modified to only allow the Master to log. |
| settings.js | Modified to expose the cluster module in function nodes; also provides the default setting for the number of Workers (see the example below the table). |
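
For orientation, a hypothetical settings.js fragment matching the last row above; the workers property name is an assumption, and the 'max' value it could take is described under General Notes below.

```js
// Hypothetical settings.js excerpt; the 'workers' property name is assumed.
module.exports = {
    uiPort: 1880,

    // Default number of Worker processes to spawn; a value of 'max'
    // would mean one Worker per logical CPU core (see General Notes).
    workers: 4,

    functionGlobalContext: {
        // Expose the cluster module to function nodes
        cluster: require('cluster')
    }
};
```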

Issues

  • Need to write tests for cluster features
  • Need to write tests for storage features

General Notes

  • Using the cluster module, the master process is instantiated and runs the main server instance with the admin interface. The settings file defines how many processes to spawn, with a 'max' argument spinning up one process per logical core. The master process listens for child death and respawns a set number of times, as configured in the settings file (a minimal sketch follows this list).
  • Communication between the cluster master and child processes is handled by passing serialized JSON messages. Each message carries the id of its originating node; multi-process enabled nodes listen for the 'message' event from the cluster master and call the 'send' method on the matching node, activating the flow at that point.
  • Individual nodes that are multi-process enabled have a drop-down to define multi-process behavior
    • A node can receive a message from the identical node on another process/machine and execute the flow as if the request originated with that node, by listening to the process 'message' event.
    • A node can debounce requests based on a timed input (ws heartbeat, timer input), ensuring that either at least one or no more than one signal is processed during the timer duration.
    • A node can debounce requests based on an external event (file system change, system state change) with a minimum debounce interval (e.g., disregard all messages from the node with id 814577ba.7eba88 and payload /tmp/test.txt within 50 ms of the last message meeting those criteria); see the debounce sketch after this list.
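
A minimal sketch of the spawn/respawn behaviour in the first bullet above, plus the master-side relay of serialized JSON messages from the second bullet. The settings names (workers, maxRespawns) and the message shape are assumptions; this is not the fork's master.js.

```js
const cluster = require('cluster');
const os = require('os');

// Hypothetical settings; 'max' means one Worker per logical core
const settings = { workers: 'max', maxRespawns: 5 };
const count = settings.workers === 'max' ? os.cpus().length : settings.workers;

if (cluster.isMaster) {
    for (let i = 0; i < count; i++) {
        cluster.fork();
    }

    // Relay serialized JSON messages from one Worker to all the others
    cluster.on('message', function (worker, msg) {
        for (const id in cluster.workers) {
            if (id !== String(worker.id)) cluster.workers[id].send(msg);
        }
    });

    // Respawn dead Workers up to the configured limit
    let respawns = 0;
    cluster.on('exit', function (worker, code, signal) {
        if (respawns < settings.maxRespawns) {
            respawns++;
            cluster.fork();
        }
    });
} else {
    // A Worker would run the full Node-RED runtime here
}
```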
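
To illustrate the last sub-bullet, a debounce keyed on the originating node id and payload. The message shape (_sourceNodeId) and the surrounding wiring are assumptions made for the example.

```js
// Hypothetical debounce keyed on originating node id + payload (last bullet above).
const DEBOUNCE_MS = 50;
const lastSeen = new Map();

function shouldProcess(msg) {
    const key = msg._sourceNodeId + '|' + String(msg.payload);
    const now = Date.now();
    const previous = lastSeen.get(key);
    lastSeen.set(key, now);
    // Disregard a message repeating the same node id and payload
    // within the debounce interval
    return previous === undefined || (now - previous) >= DEBOUNCE_MS;
}

// Example: only the first of two identical messages inside 50 ms is processed
process.on('message', function (msg) {
    if (shouldProcess(msg)) {
        // hand the message to the node's send/receive path here
    }
});
```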