2022-07-24

Add TypeScript Support to Vue 2 Project

Now that TypeScript has become the de facto standard in frontend development, new projects and third-party libraries are mostly built on its ecosystem. For existing projects, TypeScript can also be applied gradually: just add the toolchain, and start writing or rewriting part of your application in it. In this article, I will walk you through the steps of adding TypeScript to a Vue 2 project, since I am working on a legacy project myself, and TypeScript has brought a lot of benefits.

Prerequisites

For those who are new to TypeScript, I recommend reading the guide TypeScript for JavaScript Programmers. In short, TypeScript is a superset of JavaScript. It adds type annotations to variables, as well as other syntax like classes, interfaces, and decorators, some of which have already been merged into ECMAScript. At compile time, TypeScript performs static type checking. It tries to infer variable types as much as possible; where it cannot, you need to declare the type explicitly. Here is the official TypeScript Cheat Sheet.

TypeScript Cheat Sheet - Interface

2022-07-16

Programming

Manage Multiple CommandLineRunner in Spring Boot

In Spring Boot, CommandLineRunner and ApplicationRunner are two utility interfaces that we can use to execute code when the application starts. However, Spring Boot invokes all beans that implement these interfaces, and it takes some effort to execute only a subset of them. This is especially important when you are developing a console application with multiple entry points. In this article, we will use several techniques to achieve this goal.

Put CommandLineRunner in different packages

By default, @SpringBootApplication scans components (or beans) in the current package and its descendant packages. When multiple CommandLineRunners are discovered, Spring executes them all. So the first approach is to separate those runners into different packages.

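package com.shzhangji.package_a;

import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;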

@Slf4j
@SpringBootApplication
public class JobA implements CommandLineRunner {
  public static void main(String[] args) {
    SpringApplication.run(JobA.class, args);
  }

  @Override
  public void run(String... args) {
    log.info("Run package_a.JobA");
  }
}

If there is a JobB in package_b, these two jobs will not affect each other. But one problem is, when executing JobA, only components defined under package_a will be scanned. So if JobA wants to use a service in com.shzhangji.common package, we have to import this class explicitly:

package com.shzhangji.package_a;

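import com.shzhangji.common.UserService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Import;

@SpringBootApplication
@Import(UserService.class)
public class JobA implements CommandLineRunner {
  @Autowired
  private UserService userService;

  // main() and run() omitted for brevity.
}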

If there are multiple classes or packages that you want to import, you can set the scanBasePackages property instead:

@SpringBootApplication(scanBasePackages = {
    "com.shzhangji.common",
    "com.shzhangji.package_a",
})
public class JobA implements CommandLineRunner {}

2022-07-05

Programming

Store Custom Data in Spring MVC Request Context

When developing a web application with Spring MVC, you want to make some data available throughout the current request, like authentication information, request identifier, etc. These data are injected into a request-scoped context, and destroyed after the request ends. There are several ways to achieve that, and this article will demonstrate how.

Use HttpServletRequest or WebRequest

Controller methods can declare an HttpServletRequest typed argument. When the method is invoked, Spring will pass in an instance that contains information specific to the current request, like path and headers. It also provides a pair of methods that get and set custom attributes. For instance, Spring itself uses it to store the application context, locale and theme resolver.

@RestController
public class UserController {
  @GetMapping("/info")
  public String getInfo(HttpServletRequest request) {
    Object ctx = request.getAttribute("org.springframework.web.servlet.DispatcherServlet.CONTEXT");
    return String.valueOf(ctx);
  }
}

We can certainly use it to store our own data, like in a Filter that sets the user information.

@Component
public class UserFilter extends OncePerRequestFilter {
  @Override
  protected void doFilterInternal(
      HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
      throws ServletException, IOException {

    request.setAttribute("user", new User("Jerry"));
    filterChain.doFilter(request, response);
  }
}

2022-07-01

Programming

Monitor Kubernetes Volume Storage

Pods running on Kubernetes may claim a Persistent Volume to store data that lasts between pod restarts. This volume is usually of limited size, so we need to monitor its storage and alert on low free space. For stateless pods, it is also necessary to monitor their disk usage, since the application within may write logs or other contents directly onto the Docker writable layer. In Kubernetes terms, this space is called ephemeral storage. Another way to prevent ephemeral storage from filling up is to monitor the nodes’ disk space directly. This article will demonstrate how to monitor volume storage with Prometheus.

Monitor Persistent Volume

kubelet exposes the following metrics for Persistent Volumes:

$ curl http://10.0.0.1:10255/metrics
# HELP kubelet_volume_stats_capacity_bytes [ALPHA] Capacity in bytes of the volume
# TYPE kubelet_volume_stats_capacity_bytes gauge
kubelet_volume_stats_capacity_bytes{namespace="airflow",persistentvolumeclaim="data-airflow2-postgresql-0"} 4.214145024e+10
kubelet_volume_stats_capacity_bytes{namespace="default",persistentvolumeclaim="grafana"} 2.1003583488e+10

# HELP kubelet_volume_stats_used_bytes [ALPHA] Number of used bytes in the volume
# TYPE kubelet_volume_stats_used_bytes gauge
kubelet_volume_stats_used_bytes{namespace="airflow",persistentvolumeclaim="data-airflow2-postgresql-0"} 4.086779904e+09
kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="grafana"} 4.9381376e+07
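
These two gauges become an alertable signal once they are combined into a used-to-capacity ratio per claim. The following is a minimal Python sketch of that arithmetic, not part of the original setup: it parses the sample lines above with a simplified regular expression that is only assumed to be sufficient for this output, and the helper names parse_samples and usage_ratio are hypothetical.

```python
import re
from typing import Dict, Tuple

# Sample lines copied from the kubelet output above.
METRICS_TEXT = """\
# HELP kubelet_volume_stats_capacity_bytes [ALPHA] Capacity in bytes of the volume
# TYPE kubelet_volume_stats_capacity_bytes gauge
kubelet_volume_stats_capacity_bytes{namespace="airflow",persistentvolumeclaim="data-airflow2-postgresql-0"} 4.214145024e+10
kubelet_volume_stats_capacity_bytes{namespace="default",persistentvolumeclaim="grafana"} 2.1003583488e+10
kubelet_volume_stats_used_bytes{namespace="airflow",persistentvolumeclaim="data-airflow2-postgresql-0"} 4.086779904e+09
kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="grafana"} 4.9381376e+07
"""

# Simplified patterns (an assumption): metric name, label set, numeric value.
SAMPLE_RE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

Key = Tuple[str, str]  # (namespace, persistentvolumeclaim)


def parse_samples(text: str, metric: str) -> Dict[Key, float]:
    """Collect the values of one metric, keyed by namespace and claim name."""
    result: Dict[Key, float] = {}
    for line in text.splitlines():
        if line.startswith('#'):
            continue  # skip HELP and TYPE comment lines
        match = SAMPLE_RE.match(line.strip())
        if match is None or match.group(1) != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group(2)))
        key = (labels['namespace'], labels['persistentvolumeclaim'])
        result[key] = float(match.group(3))
    return result


def usage_ratio(text: str) -> Dict[Key, float]:
    """Return used / capacity for every claim that reports both gauges."""
    capacity = parse_samples(text, 'kubelet_volume_stats_capacity_bytes')
    used = parse_samples(text, 'kubelet_volume_stats_used_bytes')
    return {key: used[key] / capacity[key] for key in capacity if key in used}


if __name__ == '__main__':
    for (namespace, claim), ratio in sorted(usage_ratio(METRICS_TEXT).items()):
        print(f'{namespace}/{claim}: {ratio:.1%} used')
```

In Prometheus itself the same ratio can be written as kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes, since both series carry identical labels.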

After you set up the Prometheus Stack with its Helm chart, you will get a Service and a ServiceMonitor that help scrape these metrics. Then they can be queried in the Prometheus UI:

Prometheus UI

2022-06-26

Programming

Write Your Own Flask SQLAlchemy Extension

When it comes to connecting to a database in a Flask project, we tend to use the Flask-SQLAlchemy extension, which handles the lifecycle of the database connection, adds a set of utilities for defining models and executing queries, and integrates well with the Flask framework. However, if you are developing a rather simple project with Flask and SQLAlchemy, and do not want to depend on another third-party library, or you prefer using SQLAlchemy directly, making the model layer agnostic of web frameworks, you can write your own extension. Besides, you will gain better type hints for SQLAlchemy models, and possibly an easier migration to SQLAlchemy 2.x. This article will show you how to integrate SQLAlchemy 1.4 with Flask 2.1.

The alpha version

The official document Flask Extension Development shows how to write a sqlite3 extension that plays well with the Flask application context. So our first try is to replace sqlite3 with SQLAlchemy:

from typing import Optional

from flask import Flask, current_app
from flask.globals import _app_ctx_stack, _app_ctx_err_msg
from sqlalchemy import create_engine
from sqlalchemy.engine import Engine


class SQLAlchemyAlpha:
    def __init__(self, app: Optional[Flask] = None):
        self.app = app
        if app is not None:
            self.init_app(app)

    def init_app(self, app: Flask):
        app.config.setdefault('SQLALCHEMY_DATABASE_URI', 'sqlite://')
        app.teardown_appcontext(self.teardown)

    def connect(self) -> Engine:
        return create_engine(current_app.config['SQLALCHEMY_DATABASE_URI'])

    def teardown(self, exception) -> None:
        ctx = _app_ctx_stack.top
        if hasattr(ctx, 'sqlalchemy'):
            ctx.sqlalchemy.dispose()

    @property
    def engine(self) -> Engine:
        ctx = _app_ctx_stack.top
        if ctx is not None:
            if not hasattr(ctx, 'sqlalchemy'):
                ctx.sqlalchemy = self.connect()
            return ctx.sqlalchemy
        raise RuntimeError(_app_ctx_err_msg)

2022-06-19

Programming

OpenAPI Workflow with Flask and TypeScript

OpenAPI has become the de facto standard of designing web APIs, and there are numerous tools developed around its ecosystem. In this article, I will demonstrate the workflow of using OpenAPI in both backend and frontend projects.

OpenAPI 3.0

API Server

There are code-first and design-first approaches when using OpenAPI, and here we go with the code-first approach, i.e. writing the API server first, adding the specification to the method docs, then generating the final OpenAPI specification. The API server will be developed with the Python Flask framework and the apispec library with its marshmallow extension. Let’s first install the dependencies:

Flask==2.1.2
Flask-Cors==3.0.10
Flask-SQLAlchemy==2.5.1
SQLAlchemy==1.4.36
python-dotenv==0.20.0
apispec[marshmallow]==5.2.2
apispec-webframeworks==0.5.2

2022-06-11

Programming

Use Bootstrap V5 in Vue 3 Project

Bootstrap V5 and Vue 3.x have been released for a while, but the widely used BootstrapVue library is still based on Bootstrap V4 and Vue 2.x. A new version of BootstrapVue is under development, and there is an alternative project BootstrapVue 3 in alpha version. However, since Bootstrap is mainly a CSS framework, and it has dropped jQuery dependency in V5, it is not that difficult to integrate into a Vue 3.x project on your own. In this article, we will go through the steps of creating such a project.

Create Vite project

The recommended way of using Vue 3.x is with Vite. Install yarn and create from the vue-ts template:

yarn create vite bootstrap-vue3 --template vue-ts
cd bootstrap-vue3
yarn install
yarn dev

Add Bootstrap dependencies

Bootstrap is published on npm, and it has an extra dependency Popper, so let’s install them both:

yarn add bootstrap @popperjs/core

You may also need the type definitions:

yarn add -D @types/bootstrap

Use Bootstrap CSS

Just add a line to your App.vue file and you are free to use Bootstrap CSS:

<script setup lang="ts">
import 'bootstrap/dist/css/bootstrap.min.css'
</script>

<template>
  <button type="button" class="btn btn-primary">Primary</button>
</template>

You can also use Sass for further customization.

2022-06-03

Programming

Migrate from hexo-deployer-git to GitHub Actions

TL;DR

Create .github/workflows/pages.yml in your master branch:

name: Update gh-pages

on:
  push:
    branches:
      - master

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: "12.22"
          cache: yarn
      - run: yarn install
      - run: yarn build
      - uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./public

Go to the GitHub repo’s Settings > Pages, and change the source branch to gh-pages.

How it works

2019-08-24

Big Data

Deploy Flink Job Cluster on Kubernetes

Kubernetes is the trending container orchestration system that can be used to host various applications, from web services to data processing jobs. Applications are packaged in self-contained yet lightweight containers, and we declare how they should be deployed, how they scale, and how they are exposed as services. Flink is also a trending distributed computing framework that can run on a variety of platforms, including Kubernetes. Combining them brings us robust and scalable deployments of data processing jobs, and Flink can safely share a Kubernetes cluster with other services.

Flink on Kubernetes

When deploying Flink on Kubernetes, there are two options: session cluster and job cluster. A session cluster is like running a standalone Flink cluster on k8s that can accept multiple jobs, and it is suitable for short-running tasks or ad-hoc queries. A job cluster, on the other hand, deploys a full Flink cluster for each individual job. We build a container image for each job and provide it with dedicated resources, so that jobs are less likely to interfere with each other and can scale out independently. This article will illustrate how to run a Flink job cluster on Kubernetes. The steps are:

Compile and package the Flink job jar.
Build a Docker image containing the Flink runtime and the job jar.
Create a Kubernetes Job for Flink JobManager.
Create a Kubernetes Service for this Job.
Create a Kubernetes Deployment for Flink TaskManagers.
Enable Flink JobManager HA with ZooKeeper.
Correctly stop and resume Flink job with SavePoint facility.

2019-06-10

Big Data

Understanding Hive ACID Transactional Table

Apache Hive has supported transactions since version 0.13 to provide full ACID semantics on Hive tables, including INSERT/UPDATE/DELETE/MERGE statements, streaming data ingestion, etc. In Hive 3.0, this feature is further improved by optimizing the underlying data file structure, reducing constraints on table schema, and supporting predicate push down and vectorized query. Examples and setup can be found on the Hive wiki and in other tutorials, while this article will focus on how a transactional table is saved on HDFS, and take a closer look at the read-write process.

File Structure

Insert Data

CREATE TABLE employee (id int, name string, salary int)
STORED AS ORC TBLPROPERTIES ('transactional' = 'true');

INSERT INTO employee VALUES
(1, 'Jerry', 5000),
(2, 'Tom',   8000),
(3, 'Kate',  6000);

An INSERT statement is executed in a single transaction. It will create a delta directory containing information about this transaction and its data.

/user/hive/warehouse/employee/delta_0000001_0000001_0000
/user/hive/warehouse/employee/delta_0000001_0000001_0000/_orc_acid_version
/user/hive/warehouse/employee/delta_0000001_0000001_0000/bucket_00000

The naming scheme of this folder is delta_minWID_maxWID_stmtID, i.e. the “delta” prefix, the range of transactional writes (minimum and maximum write ID), and the statement ID. In detail:

All INSERT statements will create a delta directory. An UPDATE statement will also create a delta directory, right after a delete directory. The delete directory is prefixed with “delete_delta”.
Hive will assign a globally unique ID for every transaction, both read and write. For transactional writes like INSERT and DELETE, it will also assign a table-wise unique ID, a.k.a. a write ID. The write ID range will be encoded in the delta and delete directory names.
Statement ID is used when multiple writes into the same table happen in one transaction.
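
To make the naming scheme concrete, here is a minimal Python sketch that splits a directory name into its parts. It only handles the delta_minWID_maxWID_stmtID layout described above and its “delete_delta” counterpart; the AcidDir type and the parse_acid_dir function are hypothetical names introduced for illustration, not part of Hive.

```python
import re
from typing import NamedTuple, Optional


class AcidDir(NamedTuple):
    kind: str           # "delta" or "delete_delta"
    min_write_id: int   # first write ID covered by this directory
    max_write_id: int   # last write ID covered by this directory
    statement_id: int   # distinguishes multiple writes in one transaction


# Only the <prefix>_<minWID>_<maxWID>_<stmtID> layout from the text is matched.
ACID_DIR_RE = re.compile(r'^(delete_delta|delta)_(\d+)_(\d+)_(\d+)$')


def parse_acid_dir(path: str) -> Optional[AcidDir]:
    """Parse the last path component; return None if it is not a delta directory."""
    name = path.rstrip('/').rsplit('/', 1)[-1]
    match = ACID_DIR_RE.match(name)
    if match is None:
        return None
    kind, min_wid, max_wid, stmt_id = match.groups()
    return AcidDir(kind, int(min_wid), int(max_wid), int(stmt_id))


if __name__ == '__main__':
    # Directory created by the INSERT statement above.
    print(parse_acid_dir('/user/hive/warehouse/employee/delta_0000001_0000001_0000'))
```

For the directory created by the INSERT above, the minimum and maximum write IDs are both 1 and the statement ID is 0, which reflects a single write in a single transaction.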

Ji Zhang's Blog

If I rest, I rust.
