GitHub

Validate your environment variables with Zod

Published at:Published at:August, 15 2024Updated at:August, 15 2024

Zod is the most famous validation library in the TypeScript ecosystem. With Zod, you create a schema and validate your data according to the schema. Observe the schema below:

import { z } from 'zod'

const UserSchema = z.object({
  name: z.string().min(1),
  age: z.number({ coerce: true }).min(18),
  email: z.string().email(),
})

This schema can be used to validate an object as follows:

const data = {
  name: 'John Doe',
  age: 18,
  email: 'john@example.com',
}

// If there is a validation error, it throws an error
const validatedData = UserSchema.parse(data)

// If there is a validation error, it returns an error object for you to handle later
const safeValidatedData = UserSchema.safeParse(data)
// => { success: false; error: ZodError }
// => { success: true; data: 'billie' }

Zod is capable of performing various types of validations on your data, so be sure to read the documentation for more details.

Validating Environment Variables

We can use Zod to validate the values present in process.env and even process them before using the environment variables in our application. Usually, I like to create an environment.ts file, as in the example below:

import { z } from 'zod'

const environmentSchema = z.object({
  // Define the possible values for NODE_ENV, always leaving a default value:
  NODE_ENV: z.enum(['test', 'development', 'production']).default('production'),
  // Environment variables are always defined as strings. Here, convert the string to a number and set a default value:
  PORT: z.number({ coerce: true }).default(3000),
})

export const env = environmentSchema.parse(process.env)

Then, just import the variable and use it throughout my application:

import Fastify from 'fastify'
import { env } from './environment.js'

const app = Fastify({ logger: true })
app.listen({ port: env.PORT }, (err) => {
  if (err) {
    app.log.error(err)
    process.exit(1)
  }
})

Using TypeScript in Node.js projects

Published at:Published at:July, 01 2024Updated at:July, 08 2024

typescript Node.js javascript

TypeScript is tremendously helpful while developing Node.js applications. Let’s see how to configure it for a seamless development experience.

Setting up TypeScript

First, we need to install TypeScript. We can do this by running the following command:

npm i -D typescript

Next, we need to create a tsconfig.json file in the root of our project. This file will contain the TypeScript configuration for our project. Here is an example of a tsconfig.json file that I picked from Total TypeScript and added a few more things (read the code and pay attention to the comments):

{
  "compilerOptions": {
    /* Base Options: */
    "esModuleInterop": true,
    "skipLibCheck": true,
    "target": "es2022",
    "allowJs": true,
    "resolveJsonModule": true,
    "moduleDetection": "force",
    "isolatedModules": true,
    "verbatimModuleSyntax": true,

    /* Setting ~ as the alias for the src/ directory */
    "baseUrl": ".",
    "paths": {
      "~/*": ["src/*"]
    },

    /* Strictness */
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "noImplicitOverride": true,

    /* If transpiling with TypeScript: */
    "module": "NodeNext",
    "outDir": "dist",
    "sourceMap": true,

    /* AND if you're building for a library: */
    "declaration": true,

    /* AND if you're building for a library in a monorepo: */
    "composite": true,
    "declarationMap": true,

    /* If NOT transpiling with TypeScript: */
    "module": "preserve",
    "noEmit": true,

    /* If your code runs in the DOM: */
    "lib": ["es2022", "dom", "dom.iterable"],

    /* If your code doesn't run in the DOM: */
    "lib": ["es2022"],
  },
  /* I'm considering all your code is in src/ */
  "include": ["src/**/*.ts"]
}

Setting up the build script

Next, we need to set up a build script that will compile our TypeScript code to JavaScript. First, install tsc-alias to handle the aliases we defined in the tsconfig.json file:

npm i -D tsc-alias

Then, you can add the build script by adding the following script to our package.json file:

{
  "scripts": {
    "build": "tsc && tsc-alias"
  }
}

Setting up the development script

Next, we need to set up a development script that will watch for changes in our TypeScript files and recompile them. Personally, I like to use tsx, as it provides a much faster development experience compared to the built-in TypeScript watcher or ts-node. First, install tsx:

npm i -D tsx

Then, you can add the dev script (in order to start the project in development mode) by adding the following script to your package.json file:

{
  "scripts": {
    "build": "tsc && tsc-alias",
    "dev": "node --import=tsx --watch ./src/index.ts"
  }
}

Yes, you won’t get typechecks while developing using tsx, but you can run npm run build for that or add a new typecheck scripts to your package.json, and run it whenever you want to check for type errors:

{
  "scripts": {
    "build": "tsc && tsc-alias",
    "dev": "node --import=tsx --watch ./src/index.ts",
    "typecheck": "tsc --noEmit"
  }
}

Should I commit node_modules directory to git?

Published at:Published at:June, 04 2024Updated at:June, 04 2024

Node.js javascript npm

TL; DR: No. Please add node_modules to your .gitignore file:

node_modules

But, why?

The node_modules directory is where your package manager (that can be npm, yarn or pnpm) will install all the project dependencies listed on your package.json. Regardless of the package manager you choose, a lockfile (package-lock.json, yarn.lock or pnpm-lock.yaml, respectelly) will be generated in the first time you install your project dependencies, describing the entire dependency tree. This way, every time you need to reinstall your project dependencies, you shall get the exact same files.

The lockfile should be commited to git, enabling the re-installation of the tree of dependencies in any other ambient, what makes unecessary to commit the node_modules directory to git (also, it cuts the size of your repository by a lot, as node_modules can consumes gigabytes of space).

Mastering Insertion Sort: A Detailed Guide

Published at:Published at:January, 26 2024Updated at:January, 26 2024

algorithms computerscience tutorial

Sorting is a fundamental operation in the field of computer science, and because of this, there are various algorithms available to solve this problem. Each one is chosen based on factors such as the number of items to sort, the degree of order already present, the computer architecture where the algorithm will be executed, the type of storage device, among others. In this article, we will explore the insertion sort algorithm, understanding its nuances, strengths, and limitations.

What is insertion sort?

Insertion sort is a comparison-based algorithm that constructs its output one element at a time. It works similarly to the method we use to sort a deck of cards: we take one card at a time, compare it with the ones we already have in our hand, place the card in the correct position, and repeat this action until we finish our deck.

It is an adaptive algorithm, meaning it is efficient for small data sets, as well as other quadratic complexity algorithms ( $O(n^2)$ ). It is simple to implement, requires a constant amount of memory, as changes in the list are made in the list itself (without the need to create a new list, which would double the use of memory), and is capable of sorting the list as it receives it.

How does insertion sort work?

Initialization: We assume that the first element of our list is already sorted. We proceed to the next element, consider it as our key, and insert it in the correct position in the sorted part of the list;

Iteration: For each item in the list (starting from the second element), we store the current item (key) and its position. Then we compare the key with the elements in the sorted part of the list (elements before the key);

Insertion: If the current element in the sorted part is greater than the key, we move that element one position up. This creates space for the new key to be inserted;

Repositioning the Key: We continue moving elements one position up until we find the correct position for the key. This position is found when we encounter an element that is less than or equal to the key or when we reach the beginning of the list;

Repeat: The process is repeated for all the elements in the list.

Implementation in JavaScript

To better understand the algorithm, let’s implement it in JavaScript:

/**
* Sorts an array of numbers using the insertion sort algorithm.
* 
* @param  {number[]}  numbers - The list of numbers to be sorted.
* @returns  {number[]} - The sorted list of numbers.
*/
function insertionSort(numbers) {
  for (let i = 1; i < numbers.length; i++) {
    const key = numbers[i]
    let j = i - 1
    
    while (j >= 0 && numbers[j] > key) {
      numbers[j  +  1] =  numbers[j]
      j--
    }

    numbers[j  +  1] = key
  }
}

Complexity Analysis

Time Complexity

Best Case (Array is already sorted): $O(n)$ . This is because the inner loop (while) is not executed at all; Average Case and Worst Case (Array is sorted in reverse order): $O(n^2)$ . In the worst case, each iteration will cause an element to be moved. This makes the algorithm inefficient for large data sets.

Space Complexity

Space Complexity: $O(1)$ . Insertion sort is an in-place algorithm; it requires a constant amount of memory space.

Introduction to algorithms

Published at:Published at:January, 25 2024Updated at:January, 25 2024

Beginners Computer Science algorithms

What is an Algorithm?

An algorithm is a precise and unambiguous specification of a sequence of computational steps that can be mechanically performed^[1]. From this, we can think of a function that receives a value or a set of values as input and returns a value or a set of values as its output.

An algorithm can be correct or incorrect. It is correct when, given its input parameters, its output is correct and, therefore, solves the computational problem for which it was developed. An incorrect algorithm, on the other hand, may stop with an incorrect output or may not stop at all for some input instances. Still, some incorrect algorithms can still have useful applications.

There can be different algorithms that solve the same problem, some more efficient, that is, faster than others. But, not every problem has an efficient solution. Such problems are known as NP-complete.

NP-complete problems are very interesting: even though no efficient algorithm has been found for this class of problems, it has not been proven that it is not possible to find an efficient algorithm (from class P, which can be solved in polynomial time) for such a problem. Moreover, if there were an efficient algorithm to solve an NP-complete problem, it would mean that there is an efficient algorithm for all NP-complete problems.

P vs. NP

P vs. NP is a fundamental question in computer science, specifically in the field of computational complexity theory. It concerns the relationship between two classes of problems. The P class consists of decision problems (problems with a yes or no answer) that can be quickly solved (in polynomial time) by a deterministic computer, meaning that the time needed to solve the problem grows at a manageable rate as the size of the input increases. On the other hand, the NP class consists of decision problems for which, if a solution is provided, it can be quickly verified (also in polynomial time) by a deterministic computer. The crucial question, “Is P equal to NP?”, asks whether every problem whose solution can be quickly verified (NP) can also be solved quickly (P). This is profound because, if P were equal to NP, it would mean that all the problems that we can verify quickly can also be solved quickly. This has vast implications for various fields, including cryptography, optimization, and algorithm design.

Algorithmic Complexity

When we talk about algorithms, most of the time we are interested in the growth rate of time and space required to solve increasingly larger instances of certain problems. If we are interested in the time a particular algorithm takes to perform its function, we are interested in its time complexity. And the behavior of the time complexity limit of our algorithm in relation to the increase of the problem instances is called asymptotic time complexity. And it is this asymptotic complexity that determines the size of the problem that can be solved by algorithms^[2].

If an algorithm takes a time $cn^2$ for a constant $c$ to process an input of size $n$ , we say that the complexity of the algorithm is of the order of $n^2$ , or, in Bachmann–Landau notation (also called asymptotic notation and Big O notation), the algorithm has complexity $O(n^2)$ .

To get a better idea of what this means in relation to the runtime of our algorithm, consider that one unit of time on the computer on which we run our algorithm is 1 millisecond. Now, we want to know what the maximum size of input that our algorithm can process within a certain time limit (one second, one hour, and one day). Note, in the table below, how much the complexity of the algorithm interferes with the maximum size of the input it can handle, given the time limit:

Time Complexity	1 second	1 minute	1 hour
$n$	1000	60000	360000
$n \log_2 n$	140	4895	204095
$n^2$	31	244	1897
$n^3$	10	39	153
$2^n$	9	15	21

Even though we can build faster computers, the increase in the execution speed of less efficient algorithms would not be so significant, so we should always seek the most efficient algorithm to address a given problem.

AHO, Alfred V.; ULLMAN, Jeffrey D. Foundations of Computer Science. Stanford, 1994. ↩︎
AHO, Alfred V.; HOPCROFT, John E.; ULLMAN, Jeffrey D. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974. ↩︎

Rendering JSX on the Server with Fastify

Published at:Published at:January, 25 2024Updated at:January, 25 2024

Fastify Tips javascript Node.js

JSX is an excellent abstraction for building web interfaces. Introduced by Facebook and popularized by React, it’s an extension of JavaScript designed to abstract nested function calls. It’s expected that JSX code will be pre-processed (transpiled) into valid JavaScript before being executed in browsers or environments like Node.js.

Project Setup

First of all, let’s start our project and install the necessary dependencies:

npm init -y
npm i fastify react react-dom
npm i -D @types/node @types/react @types/react-dom tsx typescript

Now, we set up the scripts for our project. The package.json should look like this:

{
  "type":  "module",
  "name":  "fastify-react",
  "version":  "1.0.0",
  "author":  "Douglas Moura <douglas.ademoura@gmail.com>",
  "description":  "POC on rendering React components from Fastify",
  "main":  "dist/main.js",
  "scripts": {
    "start":  "tsc && node dist/main.js",
    "dev":  "tsx --watch src/main.tsx",
    "build":  "tsc"
  },
  "license":  "ISC",
  "dependencies": {
    "fastify":  "^4.25.2",
    "react":  "^18.2.0",
    "react-dom":  "^18.2.0"
  },
  "devDependencies": {
    "@types/node":  "^20.11.6",
    "@types/react":  "^18.2.48",
    "@types/react-dom":  "^18.2.18",
    "tsx":  "^4.7.0",
    "typescript":  "^5.3.3"
  }
}

And this is the tsconfig.json that we will use:

{
  "compilerOptions": {
    "target":  "ESNext",
    "module":  "ESNext",
    "lib": [
      "dom",
      "es6",
      "es2017",
      "esnext.asynciterable"
    ],
    "skipLibCheck":  true,
    "sourceMap":  false,
    "outDir":  "./dist",
    "moduleResolution":  "node",
    "removeComments":  true,
    "noImplicitAny":  true,
    "strictNullChecks":  true,
    "strictFunctionTypes":  true,
    "noImplicitThis":  true,
    "noUnusedLocals":  true,
    "noUnusedParameters":  true,
    "noImplicitReturns":  true,
    "noFallthroughCasesInSwitch":  true,
    "allowSyntheticDefaultImports":  true,
    "esModuleInterop":  true,
    "emitDecoratorMetadata":  true,
    "experimentalDecorators":  true,
    "resolveJsonModule":  true,
    "baseUrl":  ".",
    "jsx":  "react-jsx"
  },
  "exclude": [
    "node_modules"
  ],
  "include": [
    "./src/**/*.ts",
    "./src/**/*.tsx"
  ]
}

Creating our components

The React ecosystem already provides the necessary tools for rendering our components to HTML and sending them directly from the server to our client. So, first, let’s create the root component:

// src/components/root.tsx

type  RootProps  = {
  children:  React.ReactNode
  title:  string
}

export  function  Root({ children, title }:  RootProps) {
  return (
    <html  lang="en">
      <head>
        <meta  charSet="utf-8"  />
        <meta  name="viewport"  content="width=device-width, initial-scale=1"  />
        <title>{title}</title>
      </head>
      <body>{children}</body>
    </html>
  )
}

And our home page:

// src/components/index.tsx

export  function  App() {
  return (
    <h1>Hello, World!</h1>
  )
}

Configuring Fastify to Render Our React Component

As we don’t intend to load React to hydrate our HTML on the client side, we can use the renderToStaticMarkup function exported from react-dom/server. Our server initialization file will look like this:

import Fastify from  'fastify'
import { renderToStaticMarkup } from  'react-dom/server'

import { App } from  './components/index.js'
import { Root } from  './components/root.js'

type  RenderArgs  = {
  children:  React.ReactNode
  title:  string
}

const  render  = ({ title, children }:  RenderArgs) => {
  return `<!DOCTYPE html>${renderToStaticMarkup(
    <Root  title={title}>{children}</Root>
  )}`
}

const fastify =  Fastify({
  logger:  true,
})

fastify.get('/', async  function  handler(_request, reply) {
  reply.type('text/html')
  return  render({ children:  <App  />, title:  'Hello, World!' })
})

try {
  await fastify.listen({ port:  3000 })
} catch (err) {
  fastify.log.error(err)
  process.exit(1)
}

If you start the project now (npm run dev), you should see the page at http://localhost:3000. Of course, we can enhance our implementation by using the new streaming API, introduced in React 18 (which is the recommended method). To do that, we will make the following changes to our code:

import Fastify from  'fastify'
import { renderToStaticNodeStream } from  'react-dom/server'
import { Transform } from  'node:stream'

import { App } from  './components/index.js'
import { Root } from  './components/root.js'

type  RenderArgs  = {
  children:  React.ReactNode
  title:  string
}

const  render  = ({ title, children }:  RenderArgs) => {
  let  isFirstChunk  =  true
  const  prepend  =  new  Transform({
    transform(chunk, _encoding, callback) {
      if (isFirstChunk) {
        isFirstChunk  =  false
        this.push('<!DOCTYPE html>')
      }
      callback(null, chunk)
    },
  })

  return  renderToStaticNodeStream(
    <Root  title={title}>{children}</Root>
  ).pipe(prepend)
}

const  fastify  =  Fastify({
  logger:  true,
})

fastify.get('/', async  function  handler(_request, reply) {
  const  stream  =  render({ children:  <App  />, title:  'Hello, World!' })

  reply.type('text/html')
  reply.send(stream)
})

try {
  await  fastify.listen({ port:  3000 })
} catch (err) {
  fastify.log.error(err)
  process.exit(1)
}

And with that, we are able to render our React components on the server side and stream them to our client. Here is the link to the repository.

The Powerful Programmer

Published at:Published at:January, 23 2024Updated at:January, 25 2024

Software Engineering Patterns Carreer

Estimating, implementing, and deploying software rapidly is a characteristic of powerful programmers, as Kent Beck mentions in his book Extreme Programming Explained. In this article, I will explore these three points, inserting my own opinions on each one.

Estimation

Estimating a software project is difficult, and there are various different techniques on how to estimate a software project. You can create a method, through your own experience, learn the method used by other companies but, you must pay attention that the central point is that you have a good idea of how much time the project will take. Projects have a beginning, middle, and end. Learn to estimate your work.

Implementation

For me, particularly, implementing is the most fun part of the project. And as with any job, we have to be pragmatic in the choice of language and tools. Being pragmatic in the choice does not necessarily mean using the same as everyone else, because, often, some tools continue to be strong in the market due to pure inertia. Express is a good example of this. Besides there being many better options with better support (like Fastify), many teams still start new projects with Express, even if it is not being maintained regularly, does not handle Promise rejections, etc.

Besides the issue of tool choice, it is necessary that you master the technology stack of your project, being able to implement the best solutions in the shortest possible time. At the tip of your tongue, you have to know a good pattern to apply in the project, a good backend framework, a good frontend framework or even a good full-stack framework. And the experience of development cannot be left out. For the implementation to be fast, the understanding of the project must be easy, its documentation adequate, and its tests need to validate the intention flows of the user who will use the system.

Deployment

Today, can you build an entire project and put it into production, by yourself? And I’m not talking about uploading your project to a completely managed platform, like Vercel, but rather, about taking a Linux machine, installing the necessary tools, and exposing your application to the web. And no, this is not any kind of purism. If you are not a startup that can burn a few million reais per year, without worrying about the cost of your infrastructure, you should, at least, know how to start your application and keep it active between server restarts (preferably using containers), put it behind a reverse proxy (like NGINX or Caddy), configure a firewall, and make a backup of the database in three different places. You can still bring up multiple instances of your application and use the same proxy tool as a load balancer to distribute your application’s access to the different instances that are running.

Generating MD5 hashes on Node.js

Published at:Published at:January, 23 2024Updated at:January, 23 2024

Node.js javascript

You can create hashes in Node.js without the need to install any external library. Usually, I create the following utility function in the projects I work on:

/**
 * Hashes a string using md5
 *
 * @param {string} str
 * @returns {string}
 */
export const md5 = (str) => createHash('md5').update(str).digest('hex')

And I use it to replace the md5 library whenever I come across it.

Note that you can create hashes for any algorithm supported by the OpenSSL version on your platform. On Linux and Mac, you can see which algorithms are available with the command openssl list -digest-algorithms.

Finding the greatest common divisor in TypeScript

Published at:Published at:January, 12 2024Updated at:January, 25 2024

typescript mathematic javascript

First described in the classic geometry book Elements, by the ancient Greek mathematician Euclid (ca. 300 BC, at the book VII, proposition 2), the method of finding de greatest common divisor between the positive numbers $a$ and $b$ , being $a > b$ consists on the knowledge that the common divisors of $a$ and $b$ are the same of $a - b$ and $b$ . Therefore, we can find this greatest common divisor by replacing the largest number ( $a$ ) by the different between the two numbers ( $a - b$ ), repeatedly, until the two numbers are equal. In TypeScript, we can do that like this:

const gcd = (a: number, b: number): number => {
  // When `a` is equal to `b`, return the result
  if (a === b) {
    return a
  }

  // When `a` is bigger than b, calculate the the GCD again
  // with the new `a` being `a - b`.
  if (a > b) {
    return gcd(a - b, b)
  }

  // If the new `b` is bigger than `a`,
  // subtract a from it.
  return gcd(a, b - a)
}

This method can be very slow if the difference between $a$ and $b$ is too large. Thankfully, there’s another method to find the greatest common divisor between two numbers, that can be described as follows:

In order to find the greatest common divisor between $a$ and $b$ , being $a > b$ , perform the division between the two numbers. This operation will give a quotient and a remainder (that we will call $q$ and $r$ , respectively). Thus, $a$ can be described as $a = q \times b + r$ ;
If $r$ is equal to 0, we stop, because we found that the greatest common divisor of $a$ and $b$ is $b$ . Otherwise, we go back to step 1, making $b$ the new $a$ and $r$ will be the new $b$ .

Now, we can start with the implementation of the algorithm above:

const gcd = (a: number, b: number): number => {
  // First, we take the remainder between the division of a and b:
  const r = a % b
  
  // If the remainder is equal to zero, it means that we already found the
  // greatest common divisor, therefore, we return b:
  if (r === 0) {
    return b
  }
  
  // If the remainder is not equal to 0, we call the function again
  // with the new values for a and b:
  return gcd(b, a % b)
}

The implementation is very straightforward and can be read exactly as is described in steps 1 and 2. We can make the function simpler, by checking, directly, if $b$ is equal to zero and only doing the remainder operation afterwards. Therefore, if the function receive a $b$ that is equal to zero, we will know that $a$ is the greatest common divisor:

const gcd = (a: number, b: number): number => {
  if (b === 0) {
    return a
  }
  
  return gcd(b, a % b)
}

This variant is called Euclidean algorithm (in contrast of the first one, which is the Euclid’s algorithm) and it significantly faster than the first implementation.

Alternative implementations

We can also take a different approach. Instead of calling our gcd function recursively, we can implement our function using a while loop (analogous to our first implementation above):

const gcd = (a: number, b: number): number => {
  // Run this loop while a is different of b
  while (a !== b) {
    if (a > b) {
      // Subtracts b from a while a is bigger than b
      a = a - b
      // Go to the next loop
      continue
    }
    // Subtracts a from b when a <= b
    b = b - a
  }
  // Returns the greatest common divisor between a and b
  return a
}

And this is another way (analogous to our second implementation above):

const gcd = (a: number, b: number): number => {
  // Run this loop while `b` is different from 0
  while (b !== 0) {
    // Save the new value for a in a temporary variable
    const newA = b
    // Set b to the modulo of a and b (the remainder of the division between a and b)
    b = a % b
    // Set a to its new value before the next loop
    a = newA
  }
  
  // Now that b is equal to 0, we know that a is the greatest common divisor
  return a
}

Finding the greatest common between three or more numbers

The greatest of three or more numbers is equal the product of the prime factors common to all the numbers (we will explore more of that in a future article), but, you can also calculate the greatest common divisor between pairs of this list of numbers with the same functions we have showed already. So, let’s refactor our gcd function to receive multiple parameters:

const gcd = (...numbers: number[]): number => {
  const calculateGcd = (a: number, b: number): number => {
    if (b === 0) {
      return a
    }

    return calculateGcd(b, a % b)
  };

  return (
    numbers
      // Just to be sure, sort numbers in descendant order:
      .sort((a, b) => b - a)
      // Call `calculateGcd` for each pair in the numbers array:
      .reduce((a, b) => calculateGcd(a, b))
  )
}

Validating our input

Let’s guarantee that our functions should always receive, at least, two numbers and that all numbers must not be negative:

const gcd = (...numbers: number[]): number => {
  if (numbers.length < 2) {
    throw new Error("You must pass, at least, 2 numbers")
  }

  if (numbers.some((number) => number < 0)) {
    throw new Error("The numbers must be >= 0")
  }

  const calculateGcd = (a: number, b: number): number => {
    if (b === 0) {
      return a
    }

    return calculateGcd(b, a % b);
  };

  return (
    numbers
      // Just to be sure, sort numbers in descendant order:
      .sort((a, b) => b - a)
      // Call `calculateGcd` for each pair in the numbers array:
      .reduce((a, b) => calculateGcd(a, b))
  )
}

Understanding Tail Call Optimization With JavaScript

Published at:Published at:October, 24 2023Updated at:October, 24 2023

algorithms javascript mathematic Node.js

Consider the following function that calculates the factorial of a number:

const factorial = (n) => {
  let result = 1;

  while (n > 1) {
    result *= n;
    n--;
  }

  return result;
};

Factorial

In Mathematics, the factorial of a non-negative integer (n!) is the product of all positive integers less than or equal to n.

The function above was implemented iteratively, that is, it uses a loop to calculate the factorial of a number. However, it is possible to implement the same function recursively (that is, a function that references itself):

const factorial = (n) => {
  if (n === 0) return 1;

  return n * factorial(n - 1);
};

The result of both functions is the same, however, the iterative function is much more efficient (in JavaScript) than the recursive function. In addition, if we try to calculate the factorial of a very large number, we encounter the error RangeError: Maximum call stack size exceeded. Let’s understand why this happens and how we can improve the recursive function.

Call Stack

A call stack is a data structure that stores information about a program’s functions. When a function is called, it is added to the execution stack, as well as all the functions it calls. When a function returns, it is removed from the execution stack. Each function added to the stack is called a stack frame.

In order to understand what is happening, let’s try to represent, graphically, how the calculation of the factorial of 6 is done with the iterative function:

Now, compare it with the substitution model for calculating the factorial of 6 using the recursive function:

Note that, in the iterative function, the arrow shape is linear and we can see the state of each variable at each step. In addition, at each iteration of our loop, a calculation is performed and the variables stored in memory are updated. In the recursive function, the arrow shape is exponential and we cannot see the state of all variables in the first half of the processing. In addition, each time the function is executed, more memory needs to be used to store the resulting values of each execution.

But what does this mean? In order for JavaScript to calculate the factorial of 6 using the iterative function, the while condition is added to the stack, where its calculation is performed, the result variable is updated, and then the executed code block of the while is removed from the stack. This is done until the while condition is false, that is, until the value of n is less than or equal to 1.

In the recursive function, each call to the factorial function is added to the stack as many times as necessary until the if condition is false, that is, until the value of n is less than or equal to 1. This means that, to calculate the factorial of 6, the factorial function is added to the stack 6 times before being executed. And that’s why, when we try to calculate the factorial of a large number (100,000, for example), we encounter the error RangeError: Maximum call stack size exceeded: there is not enough space in the stack to store all the calls to the factorial function.

Introducing Tail Call Optimization

As defined by Dr. Axel Rauschmayer:

[…] whenever the last thing a function does is call another function, then this last function does not need to return to its caller. As a consequence, no information needs to be stored on the call stack and the function call is more like a goto (a jump). This type of call is called a tail call; not increasing the stack is called tail call optimization (TCO).

Now, we have discovered that our factorial calculation function is not tail recursive. But how can we make it tail recursive? With the help of another function:

const factorial = (n) => {
  return factorialHelper(n, 1);
};

const factorialHelper = (x, accumulator) => {
  if (x <= 1) {
    return accumulator;
  }

  return factorialHelper(x - 1, x * accumulator);
};

Now, our function is tail recursive: the last thing it does is call a function (and not calculate an expression, as in the first implementation). Now, let’s see the substitution model for calculating the factorial of 6 with our new factorial function:

The performance is superior to our first implementation, although it still doesn’t beat the performance of the iterative function. However, we still encounter the error RangeError: Maximum call stack size exceeded. But why does this happen? Because, despite our function being tail recursive, current versions of Node.js and browsers (with the exception of Safari) do not implement Tail Call Optimization (despite its inclusion in the EcmaScript specification since 2015).

But how will we solve this problem? With the help of another function, of course! For that, we will rely on the Trampoline pattern:

const trampoline = (fn) => {
  while (typeof fn === "function") {
    fn = fn();
  }

  return result;
};

Our trampoline function consists of a loop that invokes a function that wraps another function (what we call a thunk) until there are no more functions to execute. Let’s see how the implementation of our factorial function would look like with the Trampoline pattern:

const trampoline = (fn) => {
  while (typeof fn === "function") {
    fn = fn();
  }

  return fn;
};

const factorialHelper = (x, accumulator) => {
  if (x <= 1) {
    return accumulator;
  }

  // Now, a function returns another function
  return () => factorialHelper(x - 1, x * accumulator);
};

const factorial = (n) => {
  return trampoline(factorialHelper(n, 1));
};

And now, we can call our factorial function with a large number, without encountering the error RangeError: Maximum call stack size exceeded. Of course, depending on the factorial we want to calculate, we will encounter an Infinity, as it is a very large number (a number greater than Number.MAX_SAFE_INTEGER: 253 - ¹). In this case, we can use BigInt:

const trampoline = (fn) => {
  while (typeof fn === "function") {
    fn = fn();
  }

  return fn;
};

const factorialHelper = (x, accumulator) => {
  if (x <= 1) {
    return accumulator;
  }

  return () => factorialHelper(x - 1n, x * accumulator);
};

const factorial = (n) => {
  // Converting values to BigInt
  //-------------------------------\/----------\/
  return trampoline(factorialHelper(BigInt(n), 1n));
};

Typing our function

And finally, let’s add the necessary types to our factorial function:

type Thunk = bigint | (() => Thunk);

const trampoline = (fn: Thunk) => {
  while (typeof fn === "function") {
    fn = fn();
  }

  return fn;
};

const factorialHelper = (x: bigint, accumulator: bigint): Thunk => {
  if (x <= 1) {
    return accumulator;
  }

  return () => factorialHelper(x - 1n, x * accumulator);
};

const factorial = (n: number) => {
  return trampoline(factorialHelper(BigInt(n), 1n));
};