Kafka Interview Questions

Questions –

  1. How do you create a topic in Kafka using the Confluent CLI?
    • Command (see the example after this list)
  2. Explain the role of the Schema Registry in Kafka.
  3. How do you register a new schema in the Schema Registry?
  4. What is the importance of key-value messages in Kafka?
  5. Describe a scenario where using a random key for messages is beneficial.
  6. Provide an example where using a constant key for messages is necessary.
  7. Write a simple Kafka producer code that sends JSON messages to a topic.
  8. How do you serialize a custom object before sending it to a Kafka topic?
  9. Describe how you can handle serialization errors in Kafka producers.
  10. Write a Kafka consumer code that reads messages from a topic and deserializes them from JSON.
  11. How do you handle deserialization errors in Kafka consumers?
  12. Explain the process of deserializing messages into custom objects.
  13. What is a consumer group in Kafka, and why is it important?
  14. Describe a scenario where multiple consumer groups are used for a single topic.
  15. How does Kafka ensure load balancing among consumers in a group?
  16. How do you send JSON data to a Kafka topic and ensure it is properly serialized?
  17. Describe the process of consuming JSON data from a Kafka topic and converting it to a usable format.
  18. Explain how you can work with CSV data in Kafka, including serialization and deserialization.
  19. Write a Kafka producer code snippet that sends CSV data to a topic.
  20. Write a Kafka consumer code snippet that reads and processes CSV data from a topic.
  21. Different ways to receive and acknowledge messages in Kafka
  22. What makes Kafka fast?
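
For Q1, as referenced above – an illustrative, hedged example (topic name, partition and replication counts are placeholders): with the Confluent CLI, `confluent kafka topic create orders --partitions 3`; or with the Apache Kafka tooling bundled in Confluent Platform, `kafka-topics --bootstrap-server localhost:9092 --create --topic orders --partitions 3 --replication-factor 1`.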



2. Explain the role of the Schema Registry in Kafka.

The Schema Registry in Kafka plays a crucial role in managing schemas for data that is sent to and from Kafka topics.

Schema Management:

  • Centralized Schema Repository: The Schema Registry acts as a centralized repository for schemas used in Kafka messages. It stores and manages schemas independently from the Kafka brokers.
  • Schema Evolution: It facilitates schema evolution by allowing compatibility checks between different versions of schemas. This ensures that producers and consumers can evolve their schemas without causing disruptions.

Example:

  • Suppose a producer wants to publish messages to a Kafka topic using Avro serialization. Before sending data, it registers the Avro schema with the Schema Registry, which assigns it an ID. When the producer sends a message, it includes the schema ID alongside the serialized data. Consumers retrieve the schema ID from the message, fetch the corresponding schema from the Schema Registry, and deserialize the data accordingly.
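
A minimal producer sketch for the flow above, assuming Confluent's Avro serializer (io.confluent.kafka.serializers.KafkaAvroSerializer) is on the classpath; the broker address, registry URL, topic and schema are all illustrative:

import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");           // assumed broker address
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
    props.put("schema.registry.url", "http://localhost:8081");  // assumed Schema Registry address

    Schema schema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}");
    GenericRecord user = new GenericData.Record(schema);
    user.put("name", "alice");

    // On first send the serializer registers the schema (if new), caches the
    // returned schema ID, and embeds that ID in every serialized payload.
    try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
      producer.send(new ProducerRecord<>("users", "alice", user));
    }
  }
}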

22. What makes Kafka fast?

Zero-copy writes make Kafka fast, but how exactly?

Kafka is a message broker: it accepts messages from the network and writes them to disk, and vice versa. The traditional way of moving data between the network and disk involves `read` and `write` system calls, which require data to be copied back and forth between kernel space and user space.

Kafka leverages the `sendfile` system call, which copies data from one file descriptor to another entirely within the kernel. Kafka uses this to transfer data from log files on disk directly to the network socket (for example, when serving consumers), bypassing the unnecessary copies through user space.

If you are interested, read the man page of the `sendfile` system call. In most cases, when you see something extracting extreme performance, a major chunk of it comes from leveraging the right system call.
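
Kafka itself is written in Java/Scala, where this zero-copy path is exposed as FileChannel.transferTo, which delegates to `sendfile` on Linux. A minimal sketch of the same idea (the file name and destination address are placeholders):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
  public static void main(String[] args) throws IOException {
    try (FileChannel file = FileChannel.open(Path.of("segment.log"), StandardOpenOption.READ);
         SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9000))) {
      long position = 0;
      long remaining = file.size();
      // transferTo hands the copy to the kernel: bytes move from the page
      // cache straight to the socket without surfacing in user space.
      while (remaining > 0) {
        long sent = file.transferTo(position, remaining, socket);
        position += sent;
        remaining -= sent;
      }
    }
  }
}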

PS: I used this zero-copy technique while building a Remote Shuffle Service for Apache Spark. It proved pivotal in getting great performance while moving multi-TB data across machines.


Uber use case –

https://www.linkedin.com/pulse/case-study-kafka-async-queuing-consumer-proxy-vivek-bansal-lt1pc/?trackingId=sXBYzdx7T42SFdmitvQVwQ%3D%3D

Spring Boot

Index

  1. Versions
  2. Interview Questions

Versions

Version | Release Date | Major Features | Comment
3.2.3 | February 22, 2024 | Upgraded dependencies (Spring Framework 6.1.4, Spring Data JPA 3.1.3, Spring Security 6.2.2, etc.) – https://www.codejava.net/spring-boot-tutorials |
3.1.3 | September 20, 2023 | Enhanced developer experience, improved reactive support, and updated dependencies – https://spring.io/blog/2022/05/24/preparing-for-spring-boot-3-0 |
3.0.x | May 2020 – December 2022 | Introduced reactive programming, improved build system, and various dependency updates throughout the series (refer to official documentation for details) |
2.x | March 2018 – May 2020 | Introduced Spring Boot actuator, developer tools, and auto-configuration (refer to official documentation for specific features within each version) | 2.7.7 used in project (switch)
1.x | April 2014 – February 2018 | Initial versions focusing on simplifying Spring application development | 1.5.22.RELEASE used in project (consumers)

Spring Boot versions and the corresponding Spring Framework versions they support:

Spring Boot Version | Supported Spring Framework Versions
1.x | 4.x
2.0.x – 2.3.x | 5.x
2.4.x | 5.x, 6.x
3.0.x – 3.2.x | 6.x


Interview Questions

  • Why Spring Boot over Spring?
    1. Convention-over-Configuration:
      • Spring Boot: Spring Boot follows convention-over-configuration principles, reducing the need for explicit configuration. Annotations like @Service are automatically recognized and configured based on conventions.
      • Spring (Traditional): In traditional Spring applications, while you can use annotations, you might need more explicit configuration, especially in XML-based configurations.
    2. Auto-Configuration:
      • Spring Boot: Spring Boot provides auto-configuration, which means that common configurations are automatically applied based on the project’s dependencies. For example, if you have @Service annotated classes, Spring Boot will automatically configure them as Spring beans.
      • Spring (Traditional): In traditional Spring, you might need to configure components more explicitly, specifying details in XML files or Java-based configuration classes.
    3. Reduced Boilerplate Code:
      • Spring Boot: Spring Boot’s defaults and starters significantly reduce boilerplate code. You can focus more on writing business logic and less on configuration.
      • Spring (Traditional): Without the conventions and defaults of Spring Boot, you might find yourself writing more configuration code to set up beans and application context.
    4. Simplified Dependency Management:
      • Spring Boot: The use of starters simplifies dependency management. With the appropriate starter, you get a predefined set of dependencies, including those for services, making it easy to include and manage dependencies.
      • Spring (Traditional): While you can manage dependencies in traditional Spring, Spring Boot provides a more streamlined way to do so with starters.
    5. Out-of-the-Box Features:
      • Spring Boot: Spring Boot provides out-of-the-box features, such as embedded servers, metrics, and health checks. These features are often automatically configured, making it easier to develop production-ready applications.
      • Spring (Traditional): While you can manually configure these features in traditional Spring, Spring Boot simplifies the process and encourages best practices.
    6. Faster Project Bootstrap:
      • Spring Boot: With its starters and defaults, Spring Boot allows for faster project bootstrapping. You can create a fully functional application with minimal setup.
      • Spring (Traditional): Setting up a traditional Spring application might involve more manual configuration and a longer setup time.
  1. Annotations in Spring Boot
    • @SpringBootApplication, which combines (see the sketch below):
      1. @EnableAutoConfiguration
      2. @ComponentScan
      3. @SpringBootConfiguration (a specialised form of @Configuration)
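
A minimal sketch of the composed annotation in use (class and package names are illustrative):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// @SpringBootApplication = @SpringBootConfiguration + @EnableAutoConfiguration + @ComponentScan
@SpringBootApplication
public class DemoApplication {
  public static void main(String[] args) {
    SpringApplication.run(DemoApplication.class, args);
  }
}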

Messaging

Index

  • Differences
  • Versions of Apache Kafka

Key Differences:

ActiveMQ vs IBM MQ / WebSphere MQ vs Kafka

Kafka Consumption Optimisation

  • Kafka parameters & Performance Optimization

The following Kafka parameters can be balanced against one another for performance:

  1. Partition: a partition is a logical unit of storage for messages. Each topic in Kafka can be divided into one or more partitions. Messages are stored in order within each partition, and each message is assigned a unique identifier called an offset.
  2. Number of brokers:
  3. Number of consumer instances, or the number of pods on which these instances are running
  4. Concurrency: (see the listener-concurrency sketch after this list)
  5. Consumer group:
    • Use a consumer group to scale out consumption. This allows you to distribute the load of consuming messages across multiple consumers, which can improve throughput.
  6. Fetch size of batch data:
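
For item 4, a sketch of how listener concurrency is typically set with Spring Kafka (assumes spring-kafka on the classpath; the bean wiring and the value 3 are illustrative):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

@Configuration
public class KafkaConsumerConfig {

  @Bean
  public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
      ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // Number of listener threads; effective parallelism is capped by the partition count.
    factory.setConcurrency(3);
    return factory;
  }
}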

Optimal Partition Configuration-

Increase the number of partitions. This allows more consumers to read messages in parallel, which improves throughput. (So should partitions and consumers be in a 1:1 ratio for better performance?)

Note: Kafka-related bottlenecks will not usually occur while pushing data, because in this case the rate depends on how fast the external source generates it. Bottlenecks occur when there is a huge amount of data on a topic and limited consumer capacity (instances, capacity, consumption configuration, etc.).

Use cases:

Case 1: a Kafka consumer is struggling to keep up with the incoming data (suppose a lag of 170 million events). To decrease the lag and improve the performance of your Kafka setup, you can consider the following steps:

  1. Consumer Configuration:
    • Increase the number of consumer instances to match the partition count or even exceed it. Since you have 40 partitions, consider having at least 40 consumer instances. This ensures that each partition is consumed by a separate consumer, maximizing parallelism and throughput.
    • Tune the consumer configuration parameters to optimize performance. Specifically, consider adjusting the fetch.min.bytes, fetch.max.wait.ms, max.poll.records, and max.partition.fetch.bytes settings to balance the trade-off between latency and throughput. Experiment with different values to find the optimal configuration for your use case (see the configuration sketch after this list).
  2. Partition Configuration:
    • Assess the data distribution pattern to ensure an even distribution across partitions. If the data is skewed towards certain partitions, consider implementing a custom partitioner or using a key-based partitioning strategy to distribute the load more evenly.
    • If you anticipate further data growth or increased load, you might consider increasing the number of partitions. However, adding partitions to an existing Kafka topic requires careful planning, as it can have implications for ordering guarantees and consumer offsets.
  3. Cluster Capacity:
    • Evaluate the overall capacity and performance of your Kafka cluster. Ensure that your brokers have sufficient CPU, memory, and disk I/O resources to handle the volume of data and consumer concurrency.
    • Monitor the broker metrics to identify any potential bottlenecks. Consider scaling up your cluster by adding more brokers if necessary.
  4. Monitoring and Alerting:
    • Implement robust monitoring and alerting systems to track lag, throughput, and other relevant Kafka metrics. This enables you to proactively identify issues and take appropriate actions.
  5. Consumer Application Optimization:
    • Review your consumer application code for any potential performance bottlenecks. Ensure that your code is optimized, handles messages efficiently, and avoids any unnecessary delays or blocking operations.
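
As referenced in step 1, a minimal sketch of those fetch/poll knobs with the plain Apache Kafka client (the broker address, group ID and values are illustrative starting points, not recommendations):

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class TunedConsumerSketch {
  public static KafkaConsumer<String, String> build() {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "lag-recovery-group");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

    props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1_048_576);            // favour throughput: wait for ~1 MB per fetch...
    props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);                // ...but never wait longer than 500 ms
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1000);                // larger batches per poll()
    props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 2_097_152);  // up to ~2 MB per partition per fetch
    return new KafkaConsumer<>(props);
  }
}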

Spring Kafka

Index

  1. Resources
    • v3.1 features
  2. Producer
  3. Consumer
    • consumer variations -8
    • consumer factory
  4. Todo
  5. Findings/Answers

API Docs:

  1. https://docs.spring.io/spring-kafka/docs/current/api/

For new features added in a specific version of spring-kafka, refer to:

  1. https://docs.spring.io/spring-kafka/docs/ [if the version is not known, take it from the link below → select version > References > HTML]
  2. https://spring.io/projects/spring-kafka#learn

Notes to implement for performance:

https://spring.io/projects/spring-kafka#learn

LinkedIn:

13 ways to learn Kafka:

  1. Tutorial: Official Apache Kafka Quickstart – https://lnkd.in/eVrMwgCw
  2. Documentation: Official Apache Kafka Documentation – https://lnkd.in/eEU2sZvq
  3. Tutorial: Kafka Learning with RedHat – https://lnkd.in/em-wsvDt
  4. Read: Kafka – The Definitive Guide: Real-Time Data and Stream Processing at Scale – https://lnkd.in/ez3aCVsH
  5. Course: Apache Kafka Essential Training: Getting Started – https://lnkd.in/ettejx2w
  6. Read: Kafka in Action – https://lnkd.in/ed7ViYQZ
  7. Course: Apache Kafka Deep Dive – https://lnkd.in/ekaB9mv6
  8. Read: Apache Kafka Quick Start Guide – https://lnkd.in/e-3pSXnu
  9. Course: Learn Apache Kafka for Beginners – https://lnkd.in/ewh6uUyT
  10. Course: Apache Kafka Crash Course for Java and Python Developers – https://lnkd.in/e72AHUY4
  11. Read: Mastering Kafka Streams and ksqlDB: Building real-time data systems by example – https://lnkd.in/eqr_DaY2
  12. Course: Deploying and Running Apache Kafka on Kubernetes – https://lnkd.in/ezQ58usN
  13. Course: Stream Processing Design Patterns with Kafka Streams – https://lnkd.in/egrks3rn

Spring Kafka 3.1 features –

  1. Micrometer observations
  2. Same broker for multiple test cases
  3. Retryable topic changes are permanent.
  4. KafkaTemplate returns CompletableFuture instead of ListenableFuture.
  5. Testing changes
    • Since 3.0.1, the embedded broker sets spring.kafka.bootstrap-servers by default, so the application under test connects to the embedded broker.

References: https://docs.spring.io/spring-kafka/docs/current/reference/html/

Points :

  1. Starting with version 2.5, the broker can be changed at runtime – see the section "Connecting to Kafka".
    • Support for ABSwitchCluster – one cluster active at a time (see the sketch below).
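
A hedged wiring sketch for ABSwitchCluster (assumes Spring Kafka 2.5+; the broker addresses are placeholders, and the exact reconnect/reset step depends on your setup):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ABSwitchCluster;

@Configuration
public class ClusterSwitchConfig {

  @Bean
  public ABSwitchCluster switcher() {
    // Primary and secondary bootstrap servers; only one cluster is active at a time.
    return new ABSwitchCluster("primaryBroker:9092", "secondaryBroker:9092");
  }

  // Wire the switcher into the consumer/producer factories, e.g.
  //   consumerFactory.setBootstrapServersSupplier(switcher);
  // To fail over, call switcher.secondary() and then restart listener
  // containers / reset the producer factory so clients reconnect.
}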

JUnit, Mocking and Spy

Index

  • JUnit
  • Mockito
  • Testing private methods (reflection, PowerMock)

Two testing frameworks – Mockito and Spring Testing.

 | @Mock | @MockBean
Annotation part of | Mockito framework | Spring testing framework
Usage | @RunWith(MockitoJUnitRunner.class) public class MyServiceTest { @Mock private MyRepository myRepository; … } | @SpringBootTest public class MyServiceIntegrationTest { @Autowired private MyService myService; @MockBean private MyRepository myRepository; … }
Purpose | Unit testing | Integration testing

Spy vs Mock

 | @Spy | @Mock
Functionality | Partially mocked version of a real object | Complete replacement for a real object
Creation | Wraps an existing object | Creates a new object
Control | Partial control (can define specific behaviors for specific methods) | Full control over behavior
Use case | Testing interactions within a real object with some real behavior | Isolating dependencies, unit testing with specific behavior
Access | Can access private methods of the original object | Cannot access private methods of the original object
Example | below code | below code

Mock usage

// Interface we want to test (returns boolean so the call can be stubbed with thenReturn)
public interface EmailService {
  boolean sendEmail(String recipient, String message);
}

// Test class using mock (myService is the assumed object under test that delegates to EmailService)
@Test
public void testSendEmail() {
  // Create a mock object of EmailService
  EmailService mockEmailService = Mockito.mock(EmailService.class);

  // Define behavior for the mock object
  Mockito.when(mockEmailService.sendEmail("user@example.com", "Hello world!")).thenReturn(true);

  // Use the mock object in your test logic
  myService.sendNotification(mockEmailService, "user@example.com", "Hello world!");

  // Verify interactions with the mock object
  Mockito.verify(mockEmailService).sendEmail("user@example.com", "Hello world!");
}

In the above example:

  • We create a mock object of EmailService using Mockito.mock.
  • We define behavior for the sendEmail method using Mockito.when; here it always returns true.
  • We use the mock object in the test and verify its interaction later.

Spy usage

// Real implementation of EmailService
public class RealEmailService implements EmailService {
  @Override
  public boolean sendEmail(String recipient, String message) {
    // Real email-sending logic goes here (omitted for simplicity)
    return true;
  }
}

// Test class using spy
@Test
public void testSendEmailWithSpy() {
  // Create a real object
  EmailService realEmailService = new RealEmailService();

  // Create a spy object that wraps the real object
  EmailService spyEmailService = Mockito.spy(realEmailService);

  // Stub a specific call; doReturn avoids invoking the real method while stubbing
  Mockito.doReturn(true).when(spyEmailService).sendEmail("admin@example.com", "Alert!");

  // Use the spy object in your test logic
  myService.sendNotification(spyEmailService, "user@example.com", "Hello world!"); // real method runs
  myService.sendNotification(spyEmailService, "admin@example.com", "Alert!");      // stubbed call

  // Verify interactions (optional)
  Mockito.verify(spyEmailService, Mockito.times(2)).sendEmail(Mockito.anyString(), Mockito.anyString());
}

In the above example:

  • Create a real RealEmailService object.
  • Create a spy of the RealEmailService using Mockito.spy.
  • Stub the sendEmail call for the "admin@example.com" email (using doReturn, which avoids invoking the real method while stubbing).
  • Use the spy object and verify interactions (optional).

PowerMock dependency (used for testing private methods):

		<dependency>
			<groupId>org.powermock</groupId>
			<artifactId>powermock-module-junit4</artifactId>
			<version>1.7.4</version>
		</dependency>

https://www.learnbestcoding.com/post/21/unit-test-private-methods-and-classes

Reference

https://www.tutorialspoint.com/mockito/mockito_spying.htm

Quartz

Quartz provides a way to schedule the recurring execution of jobs.

Important Points

  • Quartz differentiates between the Job (what/task) and the Trigger (when) – the two are defined as separate statements (see the sketch after this list).
  • Two types of scheduling – simple and cron-based.
  • Quartz properties related to cron and other settings are usually kept in quartz.properties.
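
As noted in the first point, the Job and the Trigger are built as separate statements; a minimal sketch (the names and cron expression are illustrative):

import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class QuartzSketch {

  // The Job: WHAT to run
  public static class HelloJob implements Job {
    @Override
    public void execute(JobExecutionContext context) {
      System.out.println("Hello from Quartz");
    }
  }

  public static void main(String[] args) throws SchedulerException {
    JobDetail job = JobBuilder.newJob(HelloJob.class)
        .withIdentity("helloJob", "group1")
        .build();

    // The Trigger: WHEN to run (cron-based: every 5 minutes)
    Trigger trigger = TriggerBuilder.newTrigger()
        .withIdentity("helloTrigger", "group1")
        .withSchedule(CronScheduleBuilder.cronSchedule("0 0/5 * * * ?"))
        .build();

    Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
    scheduler.scheduleJob(job, trigger);
    scheduler.start();
  }
}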

Cron expressions and their meanings –

  1. http://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html

Questions

Configuration – how is the thread pool size configured (e.g. org.quartz.threadPool.threadCount in quartz.properties)?

Web-service Interview Questions

  1. Interceptor and its use
  2. How an interceptor works
  3. Controller Advice
  4. How do we create an interceptor?
  5. REST
    1. Richardson REST Maturity Model (4 levels, 0–3, that define the maturity of REST services)
    2. REST API
      1. In a REST API, the recommended term used to refer to multiple resources
      2. Time to first "hello world" REST call
      3. REST API versioning via headers, using Accept and Content-Type
        1. https://blog.allegro.tech/2015/01/Content-headers-or-how-to-version-api.html
    3. HTTP response codes
    4. How is communication done between two REST services?
    5. What is a response body?
    6. Idempotent HTTP requests
    7. RESTful services architecture
    8. HTTP methods
    9. API security
      1. Using OAuth, what scope is required for write access to an API?
      2. Which grant types support refresh tokens?
      3. Property to include in JSON to represent sub-resources in a REST API (_links/_embedded?)
  6. Benefits of GraphQL over REST approaches
  7. Webhooks vs synchronous APIs, and when to use each
  8. How to handle transactional commits over a distributed system (two-phase commit, saga pattern – push or pull)
    • When to use a push vs a pull mechanism

HTTP Verb | CRUD | Entire Collection (e.g. /customers) | Specific Item (e.g. /customers/{id})
POST | Create | 201 (Created), 'Location' header with link to /customers/{id} containing new ID. | 404 (Not Found), 409 (Conflict) if resource already exists.
GET | Read | 200 (OK), list of customers. Use pagination, sorting and filtering to navigate big lists. | 200 (OK), single customer. 404 (Not Found) if ID not found or invalid.
PUT | Update/Replace | 405 (Method Not Allowed), unless you want to update/replace every resource in the entire collection. | 200 (OK) or 204 (No Content). 404 (Not Found) if ID not found or invalid.
PATCH | Update/Modify | 405 (Method Not Allowed), unless you want to modify the collection itself. | 200 (OK) or 204 (No Content). 404 (Not Found) if ID not found or invalid.
DELETE | Delete | 405 (Method Not Allowed), unless you want to delete the whole collection – not often desirable. | 200 (OK). 404 (Not Found) if ID not found or invalid.

Spring Interview Questions

Index

  1. Spring bean scope
  2. Use of @Qualifier and @Primary
  3. Security in Spring
  4. Dependency injection
  5. Rest controller
  6. Difference between DI and IoC
  7. Bean lifecycle
  8. Request attribute / param
  9. Spring JDBC
  10. Qualifier
  11. Exception handler
  12. Difference between @Configuration, @EnableAutoConfiguration & @ComponentScan
  13. Circular dependency (how to resolve)
  14. Proxy and why it is needed? How to create one?


Spring bean scope