Kafka Interview Questions

Questions –

  1. How do you create a topic in Kafka using the Confluent CLI?
    • Command: confluent kafka topic create <topic-name> --partitions 3
  2. Explain the role of the Schema Registry in Kafka.
  3. How do you register a new schema in the Schema Registry?
  4. What is the importance of key-value messages in Kafka?
  5. Describe a scenario where using a random key for messages is beneficial.
  6. Provide an example where using a constant key for messages is necessary.
  7. Write a simple Kafka producer code that sends JSON messages to a topic.
  8. How do you serialize a custom object before sending it to a Kafka topic?
  9. Describe how you can handle serialization errors in Kafka producers.
  10. Write a Kafka consumer code that reads messages from a topic and deserializes them from JSON.
  11. How do you handle deserialization errors in Kafka consumers?
  12. Explain the process of deserializing messages into custom objects.
  13. What is a consumer group in Kafka, and why is it important?
  14. Describe a scenario where multiple consumer groups are used for a single topic.
  15. How does Kafka ensure load balancing among consumers in a group?
  16. How do you send JSON data to a Kafka topic and ensure it is properly serialized?
  17. Describe the process of consuming JSON data from a Kafka topic and converting it to a usable format.
  18. Explain how you can work with CSV data in Kafka, including serialization and deserialization.
  19. Write a Kafka producer code snippet that sends CSV data to a topic.
  20. Write a Kafka consumer code snippet that reads and processes CSV data from a topic.
  21. Different ways to receive and acknowledge messages in Kafka
  22. What makes Kafka fast?



2. Explain the role of the Schema Registry in Kafka.

The Schema Registry in Kafka plays a crucial role in managing schemas for data that is sent to and from Kafka topics.

Schema Management:

  • Centralized Schema Repository: The Schema Registry acts as a centralized repository for schemas used in Kafka messages. It stores and manages schemas independently from the Kafka brokers.
  • Schema Evolution: It facilitates schema evolution by allowing compatibility checks between different versions of schemas. This ensures that producers and consumers can evolve their schemas without causing disruptions.

Example:

  • Suppose a producer wants to publish messages to a Kafka topic using Avro serialization. Before sending data, it registers the Avro schema with the Schema Registry, which assigns it an ID. When the producer sends a message, it includes the schema ID alongside the serialized data. Consumers retrieve the schema ID from the message, fetch the corresponding schema from the Schema Registry, and deserialize the data accordingly.
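As a sketch, a producer wired to the Schema Registry might be configured like this. The broker and registry URLs are placeholders; the serializer class is Confluent's `io.confluent.kafka.serializers.KafkaAvroSerializer`, which registers the schema on first use and embeds the registry-assigned schema ID in each record:

```java
import java.util.Properties;

public class AvroProducerConfig {
    // Producer properties for Avro + Schema Registry (URLs are placeholders)
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Confluent's Avro serializer registers the schema if needed and
        // prepends the schema ID to the serialized payload
        props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("schema.registry.url"));
    }
}
```

Consumers use the matching KafkaAvroDeserializer with the same `schema.registry.url`, so the schema is fetched by ID at read time.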

22. What makes Kafka fast?

Zero-copy writes make Kafka fast, but how exactly?

Kafka is a message broker: it accepts messages from the network and writes them to disk, and vice versa. The traditional way of moving data between network and disk involves `read` and `write` system calls, which require data to be copied back and forth between kernel space and user space.

Kafka leverages the `sendfile` system call, which copies data from one file descriptor to another entirely within the kernel. Kafka uses this to transfer data directly from the log file on disk to the network socket when serving consumers, bypassing the unnecessary copies through user space.

If you are interested, read the man page of the `sendfile` system call. In most cases, when you see something extracting extreme performance, a major chunk of it comes from leveraging the right system call.
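On the JVM this shows up as `FileChannel.transferTo`, which delegates to zero-copy primitives such as `sendfile(2)` where the operating system supports them. A minimal file-to-file sketch (the temp files are created just for the demo):

```java
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Copy src to dst via transferTo; the JVM uses kernel-level copying
    // (e.g. sendfile) where available, so bytes need not enter user space
    static long copy(Path src, Path dst) throws Exception {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            return in.transferTo(0, in.size(), out);
        }
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.writeString(Files.createTempFile("zc", ".txt"), "hello zero copy");
        Path dst = Files.createTempFile("zc-out", ".txt");
        System.out.println(copy(src, dst)); // bytes transferred
    }
}
```

The same call with a socket channel as the target is what brokers like Kafka rely on when streaming log segments to consumers.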

PS: I used this zero-copy technique while building a Remote Shuffle Service for Apache Spark. It proved pivotal in getting great performance while moving multi-TB data across machines.


Uber Use Case –

https://www.linkedin.com/pulse/case-study-kafka-async-queuing-consumer-proxy-vivek-bansal-lt1pc/?trackingId=sXBYzdx7T42SFdmitvQVwQ%3D%3D

Java-21 features

Index

  1. Language features
    • String Templates
    • Record pattern
    • Pattern matching for switch
  2. Library improvements
    • Virtual threads
    • Sequenced collections
  3. Performance improvements
    • Generational ZGC


String Templates [Preview]

Index

  1. Why Introduced
  2. Syntax
    • Usage Example

There are three components to a template expression:

  1. A processor
    • e.g. the STR template processor
  2. A dot (.) character
  3. A template which contains the text with the embedded expressions
    • e.g. "\{var_name}"

String interpolationUsingSTRProcessor(String feelsLike, String temperature, String unit) {
    return STR."Today's weather is \{feelsLike}, with a temperature of \{temperature} degrees \{unit}";
}


Record pattern

Index

  1. Why Introduced
  2. Syntax
    • Usage Example
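A minimal usage example (Java 21; the Point record here is just illustrative): a record pattern both tests the type and deconstructs the components in one step.

```java
// Record patterns (Java 21): deconstruct a record directly in instanceof
record Point(int x, int y) {}

public class RecordPatternDemo {
    static String describe(Object obj) {
        // binds x and y only when obj really is a Point
        if (obj instanceof Point(int x, int y)) {
            return "Point at " + x + "," + y;
        }
        return "not a point";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(3, 4))); // Point at 3,4
        System.out.println(describe("hello"));         // not a point
    }
}
```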


Pattern matching for switch

Index

  1. Why Introduced
  2. Syntax
    • Usage Example
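A minimal usage example (Java 21): switch can match on type patterns, use `when` guards, and handle `null` as an explicit case label.

```java
public class SwitchPatternDemo {
    static String describe(Object obj) {
        return switch (obj) {
            case null                  -> "nothing";         // no NPE: null is a case
            case Integer i when i > 0  -> "positive int " + i; // guarded pattern
            case Integer i             -> "int " + i;
            case String s              -> "string of length " + s.length();
            default                    -> "unknown";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(42));      // positive int 42
        System.out.println(describe("kafka")); // string of length 5
        System.out.println(describe(null));    // nothing
    }
}
```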


Virtual threads

Index

  1. Why Introduced
  2. Syntax
    • Usage Example
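A minimal usage example (Java 21): virtual threads are cheap enough to create one per task instead of pooling platform threads, which suits blocking-style code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadDemo {
    static int runTasks(int n) {
        AtomicInteger done = new AtomicInteger();
        // one virtual thread per submitted task
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                executor.submit(() -> { done.incrementAndGet(); });
            }
        } // try-with-resources close() waits for submitted tasks to finish
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(10_000)); // 10000
    }
}
```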

Sequenced collections

Index

  1. Why Introduced
  2. Syntax
    • Usage Example
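A minimal usage example (Java 21): List, Deque, LinkedHashSet, etc. now share a SequencedCollection interface with uniform first/last access and a reversed view.

```java
import java.util.ArrayList;
import java.util.List;

public class SequencedDemo {
    static List<String> build() {
        List<String> list = new ArrayList<>(List.of("b", "c"));
        list.addFirst("a"); // new in Java 21 (previously only on Deque)
        list.addLast("d");
        return list;
    }

    public static void main(String[] args) {
        List<String> list = build();
        System.out.println(list.getFirst());  // a
        System.out.println(list.getLast());   // d
        System.out.println(list.reversed());  // [d, c, b, a]
    }
}
```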

Java-17 features

Index

  1. NullPointerException message enhancement (helpful NPE messages)
  2. null allowed as a switch case label (pattern matching for switch, preview in Java 17)
  3. Switch expression enhancements
    • switch cases can use the arrow (->) form, which returns a value
    • the yield keyword returns a value from a block, e.g. in the default section
    • multiple case labels can be separated by commas
  4. Sealed classes
    • only permitted classes can inherit
  5. Record classes
    • reduced boilerplate
    • immutable and final – they are not extensible
    • no setters
    • transparent carriers for immutable data, i.e. a concise replacement for a traditional POJO
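Sealed classes and records combine naturally. A minimal Java 17 sketch (the Shape hierarchy is illustrative): only the permitted records may implement the interface, so the set of subtypes is closed.

```java
// Only Circle and Square may implement Shape
sealed interface Shape permits Circle, Square {}
record Circle(double radius) implements Shape {}
record Square(double side) implements Shape {}

public class SealedRecordDemo {
    static double area(Shape s) {
        // pattern matching for instanceof (standard since Java 16)
        if (s instanceof Circle c) return Math.PI * c.radius() * c.radius();
        return ((Square) s).side() * ((Square) s).side(); // only Square remains
    }

    public static void main(String[] args) {
        System.out.println(area(new Square(3))); // 9.0
    }
}
```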

Spring-boot

Index

  1. Versions
  2. Interview Questions

Versions

Version | Release Date | Major Features | Comment
3.2.3 | February 22, 2024 | Upgraded dependencies (Spring Framework 6.1.4, Spring Data JPA, Spring Security 6.2.2, etc.) – https://www.codejava.net/spring-boot-tutorials |
3.1.3 | September 20, 2023 | Enhanced developer experience, improved reactive support, and updated dependencies – https://spring.io/blog/2022/05/24/preparing-for-spring-boot-3-0 |
3.0.x | November 2022 | Java 17 baseline, Jakarta EE 9 namespace, GraalVM native image support |
2.x | March 2018 – May 2022 | Spring Framework 5 baseline, reactive WebFlux support, enhanced actuator and developer tools | 2.7.7 used in project (switch)
1.x | April 2014 – February 2018 | Initial versions focusing on simplifying Spring application development | 1.5.22.RELEASE used in project (consumers)

Spring Boot versions and corresponding Spring Framework version support:

Spring Boot Version | Supported Spring Framework Version
1.x | 4.x
2.0.x – 2.3.x | 5.0.x – 5.2.x
2.4.x – 2.7.x | 5.3.x
3.0.x – 3.2.x | 6.x


Interview Questions

  • Why Spring Boot over Spring?
    1. Convention-over-Configuration:
      • Spring Boot: Spring Boot follows convention-over-configuration principles, reducing the need for explicit configuration. Annotations like @Service are automatically recognized and configured based on conventions.
      • Spring (Traditional): In traditional Spring applications, while you can use annotations, you might need more explicit configuration, especially in XML-based configurations.
    2. Auto-Configuration:
      • Spring Boot: Spring Boot provides auto-configuration, which means that common configurations are automatically applied based on the project’s dependencies. For example, if you have @Service annotated classes, Spring Boot will automatically configure them as Spring beans.
      • Spring (Traditional): In traditional Spring, you might need to configure components more explicitly, specifying details in XML files or Java-based configuration classes.
    3. Reduced Boilerplate Code:
      • Spring Boot: Spring Boot’s defaults and starters significantly reduce boilerplate code. You can focus more on writing business logic and less on configuration.
      • Spring (Traditional): Without the conventions and defaults of Spring Boot, you might find yourself writing more configuration code to set up beans and application context.
    4. Simplified Dependency Management:
      • Spring Boot: The use of starters simplifies dependency management. With the appropriate starter, you get a predefined set of dependencies, including those for services, making it easy to include and manage dependencies.
      • Spring (Traditional): While you can manage dependencies in traditional Spring, Spring Boot provides a more streamlined way to do so with starters.
    5. Out-of-the-Box Features:
      • Spring Boot: Spring Boot provides out-of-the-box features, such as embedded servers, metrics, and health checks. These features are often automatically configured, making it easier to develop production-ready applications.
      • Spring (Traditional): While you can manually configure these features in traditional Spring, Spring Boot simplifies the process and encourages best practices.
    6. Faster Project Bootstrap:
      • Spring Boot: With its starters and defaults, Spring Boot allows for faster project bootstrapping. You can create a fully functional application with minimal setup.
      • Spring (Traditional): Setting up a traditional Spring application might involve more manual configuration and a longer setup time.
  • Annotations in Spring Boot
    • @SpringBootApplication combines:
      1. @EnableAutoConfiguration
      2. @ComponentScan
      3. @SpringBootConfiguration (a specialised form of @Configuration)

Microservices – Design Patterns

Index

Design patterns

  1. Decentralized Data Management:
    • Implementation:
      1. Each microservice manages its own data and database independently.
      2. Avoids a shared database to prevent tight coupling between services.
    • Java Libraries:
      1. No specific library is tied to this pattern, as it’s more of an architectural principle.
      2. The choice of databases is left to individual microservices. For example, microservices might use Spring Data JPA for database interactions.
  2. Event-Driven Architecture:
    • Implementation:
      1. Microservices communicate asynchronously through events.
      2. Events represent state changes and are used for inter-service communication.
    • Java Libraries:
      1. Apache Kafka: A distributed event streaming platform.
      2. Spring Cloud Stream: Simplifies event-driven microservices development using Spring Boot and Apache Kafka.
  3. Service Discovery:
    • Implementation:
      1. Microservices dynamically discover and communicate with each other.
      2. Service registry and discovery mechanisms facilitate this dynamic communication.
    • Java Libraries:
      1. Netflix Eureka: A service registry for locating services in the cloud.
      2. Consul: A tool for service discovery and configuration.

4. API Gateway:

  • Implementation:
    • An entry point that consolidates and manages requests to various microservices.
    • Handles authentication, load balancing, and routing.
  • Java Libraries:
    • Spring Cloud Gateway: A dynamic routing and API gateway powered by Spring WebFlux.
    • Netflix Zuul: A dynamic router and filter for edge services.

5. Circuit Breaker Pattern:

  • Implementation:
    • Protects microservices from failures in dependent services.
    • Opens the circuit if a service is not responsive, preventing cascading failures.
  • Java Libraries:
    • Netflix Hystrix: A library for adding circuit breakers to your services.
    • Resilience4j: A lightweight fault tolerance library.

6. Retry Pattern:

  • Implementation:
    • Retries failed operations to enhance system reliability.
    • Helps in dealing with transient errors.
  • Java Libraries:
    • Spring Retry: A Spring Framework project for retrying failed operations.
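The retry idea can be sketched in plain Java. Spring Retry provides this declaratively (e.g. via @Retryable); this hand-rolled version is only illustrative of the pattern:

```java
import java.util.function.Supplier;

public class RetryDemo {
    // Retry a failing operation up to maxAttempts times
    static <T> T withRetry(Supplier<T> op, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e; // assume the failure is transient; try again
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // fails twice, succeeds on the third attempt
        String result = withRetry(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient failure");
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A production version would add a backoff delay between attempts and only retry exceptions known to be transient.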

7. Saga Pattern:

  • Implementation:
    • Manages long-lived transactions across multiple microservices.
    • Breaks down a transaction into a sequence of smaller, more manageable steps.
    • Handles compensating transactions in case of failures.
  • Java Libraries:
    • No specific library is tied directly to the Saga Pattern, but frameworks like Axon or Eventuate provide support for implementing sagas in Java-based microservices.

Messaging

Index

  • Differences between messaging systems
  • Versions of Apache Kafka

Key Differences:

ActiveMQ vs IBM MQ / WebSphere MQ vs Kafka

Payments Introduction

There are a number of standards that define protocols (rules) for transactions, be they financial or non-financial. These standards ensure consistency in the exchange of information between the different entities involved (banks [acquirer, issuer], merchant, user). For example, one standard that is widely used internationally is ISO 20022 (for all payments), along with ISO 8583 (specifically for card payments); others include ACH, managed by Nacha.
Along similar lines, in India we have a payment standard called UPI (Unified Payments Interface), managed by NPCI (National Payments Corporation of India).

The UPI standard defines a set of protocols in a specific format (XML, with header, body, etc.) used by the involved parties for a transaction.

sample

<upi:Phase>
  <Head> version </Head>
  <Meta> type of phase </Meta>
  <Txn> txn details: C|D|CR|DR|R
    <Rules> primarily for mandates, e.g. expiry time </Rules>
  </Txn>
  <Payer name=""> payer details </Payer>
  <Payee> payee details </Payee>
</upi:Phase>
Note: the standard is structured as XML elements, i.e. a root with children (Head, Meta, Txn, Payer, Payee, etc.), and elements have attributes (e.g. name in Payer) that carry additional information.

A transaction has different phases, each managed by a different API, and
there is a protocol for each phase/API: ReqPay, RespPay, ReqAuth, RespAuth, ReqPayCollect, ReqPay-Intent, ReqPay-Mandate, RespPay-Mandate.

The above protocols are used with various instruments (cards, Lite payments, FIR, wallet transactions, etc.), with variations in the parameters/attributes/elements passed as part of the standard protocol.

We can go into more depth on each phase in follow-up posts.

Trie

Index

  1. Definition
  2. Time-Complexity
  3. Programmatic representation
  4. Operations
    1. Insertion
    2. Search/Auto-completion

Definition

  1. The name Trie comes from the word reTrieval.
  2. A Trie is a k-way tree data structure.
  3. It is optimised for retrieval of strings that share a common prefix, which is stored on common nodes.

Time Complexity

  1. Insertion and search have time complexity O(N), where N is the length of the string.

Programmatic representation

  1. A node is represented as a class having 2 things –
    1. children – an array of 26 child references, one per lowercase letter.
    2. isLastNode – marks the node that ends a complete word.

class TrieNode {
    // children[i] is the child node for letter ('a' + i)
    TrieNode[] children;
    // marks the node that ends a complete word
    boolean isLastNode;

    TrieNode() {
        children = new TrieNode[26];
        isLastNode = false;
    }
}

Trie Operations and applications:

Insertion and auto-completion:

class TrieNode {
    TrieNode[] children;
    boolean isLastNode;

    TrieNode() {
        children = new TrieNode[26];
        isLastNode = false;
    }
}

class Trie {
    TrieNode root;

    Trie() {
        root = new TrieNode();
    }

    // walk the word character by character, creating missing nodes
    void insert(String word) {
        TrieNode current = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (current.children[idx] == null) {
                current.children[idx] = new TrieNode();
            }
            current = current.children[idx];
        }
        current.isLastNode = true;
    }

    // walk to the node for the prefix, then print every word below it
    void autoComplete(String prefix) {
        TrieNode current = root;
        for (char c : prefix.toCharArray()) {
            int idx = c - 'a';
            if (current.children[idx] == null) {
                return; // no word starts with this prefix
            }
            current = current.children[idx];
        }
        collect(current, new StringBuilder(prefix));
    }

    // depth-first traversal printing each complete word found
    private void collect(TrieNode node, StringBuilder word) {
        if (node.isLastNode) {
            System.out.println(word);
        }
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                word.append((char) ('a' + i));
                collect(node.children[i], word);
                word.deleteCharAt(word.length() - 1);
            }
        }
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.insert("parag");
        trie.insert("parameter");
        trie.insert("parashoot");
        trie.autoComplete("para"); // prints parag, parameter, parashoot
    }
}