Understanding Streams in Node.js

Author: Ganesh Negi
Streams in Node.js

What Are Streams?

Streams in Node.js move data piece by piece (instead of all at once) from one place to another, preventing out-of-memory errors when handling large files.

An Example Scenario

πŸͺ£ Think of it Like Buckets & Pipes

Imagine you have two buckets:

πŸ”Ή Source bucket – full of water (data).
πŸ”Ή Destination bucket – empty and needs to be filled.

How to Transfer Water?

1️⃣ Using a Buffer (Old Way):

Take all the water from the source into a smaller buffer bucket, carry that bucket over, and pour everything into the destination. The problem? If the buffer bucket is too small, it overflows.

2️⃣ Using a Stream (Efficient Way):

Instead of carrying everything at once, use a hose pipe (stream). Water (data) flows gradually from the source to the destination without overflowing.

πŸ‘‰ In Node.js, streams act like this hose pipe! They transfer data in small chunks, making the process fast, memory-efficient, and smooth.

πŸš€ Why Use Streams in Node.js?

βœ” No memory overload – handles large files without consuming too much memory.
βœ” Faster processing – data flows continuously, with no waiting for the full file to load.
βœ” Ideal for big files – great for videos, logs, file uploads, and real-time data processing (see the sketch below).
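
For example, here is a minimal sketch of serving a large file over HTTP with a stream instead of loading it fully into memory first (the file name "large.log" and port 3000 are placeholders, not from the original article):

const fs = require("fs");
const http = require("http");

http.createServer((req, res) => {
  // Instead of fs.readFile (which would buffer the whole file in memory),
  // stream the file chunk by chunk straight into the HTTP response,
  // which is itself a writable stream.
  fs.createReadStream("large.log").pipe(res);
}).listen(3000);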

Understanding Readable Streams in Node.js

A readable stream is a way to read data bit by bit from a source and send it in small chunks to a destination. Instead of loading the entire file or dataset at once, streams process data gradually, making them efficient and memory-friendly.

πŸ”Ή How Do Readable Streams Work?

1️⃣ Reads data from a source (file, database, API, etc.).
2️⃣ Emits 'data' events – sends data chunk by chunk to the next stage.
3️⃣ Emits an 'end' event when all data has been read.
4️⃣ Handles errors using the 'error' event.

βœ… Example: Creating a Readable Stream from an Array

const { Readable } = require("stream");

const dataArray = ["Hello", "World", "Streams", "in", "Node.js"];

const readStream = new Readable({
  read() {
    // Push one word at a time; pushing null signals the end of the stream.
    this.push(dataArray.length ? dataArray.shift() + " " : null);
  },
});

readStream.on("data", (chunk) => {
  console.log("Received chunk:", chunk.toString());
});

readStream.on("end", () => {
  console.log("Stream finished reading!");
});

readStream.on("error", (err) => {
  console.error("Error:", err);
});

πŸ”Ή What Happens Here?

βœ” The readStream reads words one by one and emits them as 'data' events.
βœ” When the array is empty, it emits an 'end' event, signaling the stream is complete.
βœ” If any issue occurs, the 'error' event is triggered.

πŸ“Œ Types of Data Read in Streams

Streams can read data in three different modes:

πŸ”Έ Binary Mode (Default) – reads raw binary data as Buffer chunks.
πŸ”Έ String Mode – reads data as strings (set encoding: "utf-8").
πŸ”Έ Object Mode – reads entire objects instead of strings/binary (set objectMode: true); see the sketch after the string example below.

Example: Reading as String

const fs = require("fs");

const readStream = fs.createReadStream("file.txt", { encoding: "utf-8" });

readStream.on("data", (chunk) => {
  console.log("String chunk:", chunk);
});
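
Object mode works the same way, except each chunk is a whole JavaScript object. A minimal sketch (the users array is just an illustration, not from the original example):

const { Readable } = require("stream");

const users = [{ name: "Asha" }, { name: "Ravi" }];

const objectStream = new Readable({
  objectMode: true, // chunks are whole objects instead of Buffers/strings
  read() {
    this.push(users.length ? users.shift() : null); // null ends the stream
  },
});

objectStream.on("data", (obj) => {
  console.log("Object chunk:", obj);
});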

🎯 Why Use Readable Streams?

βœ” Handles large data without memory overload.
βœ” Faster processing and better performance.
βœ” Useful for reading files, APIs, real-time data, and logs.

With readable streams, data flows efficiently instead of waiting for everything to load at once! πŸš€

Understanding Writable Streams in Node.js

A writable stream is the destination for data. It receives data in chunks and writes it to a file, database, or any other output. Just like readable streams, writable streams process data gradually instead of all at once, making them efficient for handling large data.

πŸ”Ή How Do Writable Streams Work?

1️⃣ Receives data from a readable stream or another source.
2️⃣ Writes data in chunks using the .write() method.
3️⃣ Ends the stream with .end() when writing is complete.
4️⃣ Emits events like 'finish' and 'error' for handling results.

βœ… Example: Writing Data to a File Using a Writable Stream

const fs = require("fs");

const writeStream = fs.createWriteStream("output.txt");

writeStream.write("Hello, this is the first chunk.\n");
writeStream.write("Here comes another piece of data.\n");

writeStream.end("Final chunk of data. Stream is ending.\n");

writeStream.on("finish", () => {
  console.log("Write stream finished!");
});

writeStream.on("error", (err) => {
  console.error("Error:", err);
});

πŸ”Ή What Happens Here?

βœ” writeStream.write() writes data in chunks to output.txt.
βœ” writeStream.end() signals the end of the writing process.
βœ” The 'finish' event confirms when writing is complete.
βœ” The 'error' event handles any writing issues.

πŸ“Œ Common Writable Stream Methods

πŸ”Έ .write(data) – writes a chunk of data.
πŸ”Έ .end(data?) – closes the stream (with optional final data).
πŸ”Έ .on('finish', callback) – triggered when writing is complete.
πŸ”Έ .on('error', callback) – catches errors during writing.

βœ… Example: Piping Data from a Readable to a Writable Stream

const readStream = fs.createReadStream("input.txt");
const writeStream = fs.createWriteStream("output.txt");

readStream.pipe(writeStream);

πŸ”₯ Super Efficient! Instead of manually reading and writing chunks, the .pipe() method automatically transfers data from the readable stream to the writable stream.

🎯 Why Use Writable Streams?

βœ” Handles large files efficiently.
βœ” Minimizes memory usage.
βœ” Essential for logging, saving files, and data processing (see the logging sketch below).
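
As a concrete case of the logging use mentioned above, a log writer is just a writable stream opened in append mode. A minimal sketch (the file name and messages are placeholders):

const fs = require("fs");

// flags: "a" opens the file in append mode, so every write adds to the end.
const logStream = fs.createWriteStream("app.log", { flags: "a" });

logStream.write(`[${new Date().toISOString()}] Server started\n`);
logStream.write(`[${new Date().toISOString()}] Request received\n`);
logStream.end();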

Writable streams make storing and processing data seamless, ensuring your Node.js applications run smoothly and efficiently! πŸš€

Understanding Backpressure in Node.js Streams

Backpressure occurs when a writable stream cannot keep up with the speed of an incoming readable stream. If it is ignored, unwritten data piles up in memory, causing performance issues, memory overload, and, in the worst case (an out-of-memory crash), even data loss.

πŸͺ£ Think of It Like a Funnel & Hose

Imagine a hose (readable stream) pouring water into a funnel (writable stream) connected to a bucket (destination).

🚰 What happens if you pour water too fast?

πŸ”Ή The funnel gets full and water starts spilling over.
πŸ”Ή You must pause pouring until the funnel clears.

πŸ‘‰ In Node.js, when a writable stream can’t handle incoming data fast enough, it creates backpressure.

βœ… How Does Backpressure Work in Node.js?

The .write() method of a writable stream returns true or false:

βœ” true – the stream can accept more data.
❌ false – the stream's internal buffer is full; pause the readable stream.

Example: Handling Backpressure Properly

const fs = require("fs");

const readStream = fs.createReadStream("largeFile.txt");
const writeStream = fs.createWriteStream("output.txt");

readStream.on("data", (chunk) => {
  if (!writeStream.write(chunk)) {
    console.log("Backpressure detected! Pausing readStream...");
    readStream.pause();
  }
});

writeStream.on("drain", () => {
  console.log("Writable stream ready again. Resuming readStream...");
  readStream.resume();
});

readStream.on("end", () => {
  writeStream.end();
  console.log("File processing complete!");
});

πŸ”Ή What Happens Here?

βœ” Listens for 'data' events from readStream.
βœ” If writeStream.write(chunk) returns false, it pauses readStream.
βœ” When the writable stream is ready again ('drain' event), it resumes reading.

🎯 Why Is Backpressure Important?

βœ” Prevents memory overload – ensures writable streams don't get overwhelmed.
βœ” Keeps data flow controlled – the producer never races far ahead of the consumer.
βœ” Optimizes performance – helps Node.js handle large files smoothly.

Backpressure balances data flow, preventing slow writable streams from becoming a bottleneck! πŸš€

Piping Streams in Node.js

Handling readable and writable streams manually requires listening to multiple events like data, drain, and end. But there's a simpler wayβ€”using the pipe() method!

πŸš€ What is pipe()?

πŸ”Ή The pipe() method connects a readable stream directly to a writable stream.
πŸ”Ή It automatically handles backpressure, ensuring smooth data flow.
πŸ”Ή The only thing we still need to manage is error handling.

βœ… Example: Copying a File Using pipe()

const fs = require("fs");

const readStream = fs.createReadStream("input.txt");
const writeStream = fs.createWriteStream("output.txt");

readStream.pipe(writeStream);

writeStream.on("finish", () => {
  console.log("File copied successfully!");
});

writeStream.on("error", (err) => {
  console.error("Error:", err);
});

πŸ”Ή What Happens Here?

βœ” readStream.pipe(writeStream) automatically transfers data in chunks.
βœ” No need to manually pause, resume, or listen for 'drain' events – it's all handled internally.
βœ” If a write error occurs, we catch it using writeStream.on("error"); note that the readable stream needs its own error handler as well, since pipe() does not forward errors.

🎯 Why Use pipe()?

βœ” Less code, fewer bugs – no need to manually manage backpressure.
βœ” Memory efficient – processes data chunk by chunk instead of loading it all at once.
βœ” Fast and optimized – ideal for large file transfers, APIs, and data processing.

With pipe(), streams become super easy to use, making your Node.js applications more efficient and maintainable! πŸš€
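
Worth noting: Node.js also ships a stream.pipeline() helper. Unlike pipe(), it forwards errors from every stream in the chain to a single callback and destroys all streams on failure. A minimal sketch (file names are placeholders):

const fs = require("fs");
const { pipeline } = require("stream");

pipeline(
  fs.createReadStream("input.txt"),
  fs.createWriteStream("output.txt"),
  (err) => {
    if (err) {
      console.error("Pipeline failed:", err);
    } else {
      console.log("Pipeline succeeded!");
    }
  }
);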

Understanding Duplex Streams in Node.js

A duplex stream is a special type of stream that is both readable and writable at the same time. Its readable and writable sides are independent, so it can act as a middle layer between a readable and a writable stream, or handle bidirectional data flow on its own (a network socket is the classic example).

πŸ”„ How Do Duplex Streams Work?

βœ” Reads data from an input stream (like a readable stream).
βœ” Processes or modifies the data (optional).
βœ” Writes the processed data to an output stream (like a writable stream).

Duplex vs. Transform Streams

πŸ”Ή A duplex stream does not necessarily modify the data – it just passes it along.
πŸ”Ή A transform stream (a special type of duplex stream) changes the data before passing it forward.

βœ… Example: Creating a Custom Duplex Stream

const { Duplex } = require("stream");

class MyDuplexStream extends Duplex {
  constructor() {
    super();
    this.data = [];
  }

  // Writable side: store each incoming chunk in an internal buffer.
  _write(chunk, encoding, callback) {
    console.log(`Writing: ${chunk.toString()}`);
    this.data.push(chunk);
    callback();
  }

  // Readable side: emit the stored chunks one at a time.
  _read(size) {
    if (this.data.length === 0) {
      this.push(null); // Signal end of stream
    } else {
      this.push(this.data.shift());
    }
  }
}

const duplexStream = new MyDuplexStream();

duplexStream.write("Hello, ");
duplexStream.write("this is a duplex stream!");

duplexStream.on("data", (chunk) => {
  console.log(`Reading: ${chunk.toString()}`);
});

duplexStream.end();

πŸ”Ή What Happens Here?

βœ” _write() stores each incoming chunk in an internal array.
βœ” _read() retrieves and emits the stored data chunk by chunk.
βœ” The stream therefore acts as both a readable and a writable stream.

🎯 When to Use Duplex Streams?

βœ” Data relay – forward data from one stream to another.
βœ” Network communication – handles bidirectional data flow, like sockets (see the sketch below).
βœ” Custom processing – store and process data before passing it along.

Duplex streams power advanced pipelines in Node.js, making data handling flexible and efficient! πŸš€
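
For instance, every TCP socket in Node.js is a built-in duplex stream: readable for data coming from the client and writable for data going back. A minimal echo-server sketch (the port number is a placeholder):

const net = require("net");

const server = net.createServer((socket) => {
  // The socket is a duplex stream, so piping it to itself
  // echoes everything the client sends straight back.
  socket.pipe(socket);
});

server.listen(8080, () => {
  console.log("Echo server listening on port 8080");
});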

Understanding Transform Streams in Node.js

A transform stream is a special type of duplex stream that modifies the data as it passes through. Unlike a normal duplex stream, which just forwards data, a transform stream changes the data before writing it to the output.

πŸ”„ How Do Transform Streams Work?

βœ” Receives data from a readable stream.
βœ” Processes and modifies the data.
βœ” Writes the transformed data to a writable stream.

βœ… Example: Replacing Vowels with β€˜X’ in a Transform Stream

const { Transform } = require("stream");

class ReplaceVowelsStream extends Transform {
  _transform(chunk, encoding, callback) {
    const modifiedData = chunk
      .toString()
      .replace(/[aeiouAEIOU]/g, "X"); // Replace vowels with 'X'

    this.push(modifiedData); // Send modified data to writable stream
    callback();
  }
}

const transformStream = new ReplaceVowelsStream();

process.stdin.pipe(transformStream).pipe(process.stdout);

πŸ”Ή What Happens Here?

βœ” process.stdin.pipe(transformStream).pipe(process.stdout) reads input from the console (stdin).
βœ” ReplaceVowelsStream replaces the vowels in the text with 'X'.
βœ” The modified text is written back to the console (stdout).

Example Output

Input:   Hello World!
Output:  HXllX WXrld!

🎯 When to Use Transform Streams?

βœ” Data transformation – modify text, compress files, encrypt/decrypt data (see the gzip sketch below).
βœ” File processing – convert text files, reformat JSON, parse CSV.
βœ” Real-time modifications – modify network requests, manipulate API responses.
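
Compression is a good built-in example: zlib.createGzip() returns a transform stream, so it can be piped between a readable and a writable stream with no extra code. A minimal sketch (file names are placeholders):

const fs = require("fs");
const zlib = require("zlib");

// Raw bytes go in, gzip-compressed bytes come out.
fs.createReadStream("input.txt")
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream("input.txt.gz"))
  .on("finish", () => console.log("File compressed!"));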

Transform streams simplify data processing in Node.js, making pipelines more powerful and efficient! πŸš€