BD201Notes

  1. Topics for writing task

Kafka

  • Components (Producer, Consumer, Broker, ZooKeeper)

  • Topics, Partitions & Replicas

  • Leader and ISR in Kafka

  • Kafka log pruning (retention and compaction)

  • Kafka delivery semantics

  • Micro batch vs. continuous processing mode

  • Kafka vs. Kafka Streams

  • Stream Partitions and Tasks in Kafka Streams

  • Threads in Kafka Streams

  • State Stores in Kafka Streams

Last updated 5 years ago