Some programming

Posted by TX on February 12, 2025

General Computer Science questions

Hash Tables

What is a hash table, and how does it work?

  • A hash table is a data structure that maps keys to values using a hash function to compute an index into an array of buckets.
  • It allows for fast lookups, inserts, and deletions.

Time Complexity:

  • Average Case: ( O(1) ) for insertion, search, and delete.
  • Worst Case: ( O(n) ) for all three operations when all keys map to the same bucket due to collisions.

What is a hash collision, and how can it be handled?

  • A hash collision occurs when two different keys hash to the same index.
  • Collision Handling Techniques:
    • Chaining: Use linked lists to store multiple keys in the same bucket.
    • Open Addressing: Probes the next available slot in the array.

What is a good hash function?

  • Uniformly distributes keys across the hash table.
  • Minimizes collisions.
  • Efficient to compute.

Heap

What is a heap, and how is it used?

  • A heap is a specialized binary tree used to maintain a priority queue.
  • Min-Heap: The smallest element is at the root.
  • Max-Heap: The largest element is at the root.

Time Complexity:

  • Find Min/Max: ( O(1) )
  • Insert/Delete: ( O(\log n) )
  • Building a Heap: ( O(n) )
  • Heap Sort: ( O(n \log n) )

BST vs. Heap

  • Heap: Ensures the highest/lowest priority element is always at the root.
  • BST: Maintains an ordered structure where the left child is smaller and the right child is larger than the parent.

Time Complexity:

  • Search in BST:
    • Average: ( O(\log n) )
    • Worst Case (Unbalanced): ( O(n) )

SQL Concepts

What is the difference between a primary key and a unique key?

  • Primary Key: Uniquely identifies a row and cannot have NULL values.
  • Unique Key: Ensures uniqueness but allows one NULL value.

What is the purpose of indexing in a database?

  • Indexes improve the speed of data retrieval by allowing faster row lookups compared to scanning the entire table.
  • Especially useful for large datasets.

What is normalization in SQL?

  • The process of organizing a database to:
    • Reduce redundancy.
    • Improve data integrity.
    • Achieved by dividing large tables into smaller ones and defining relationships between them.

What is a foreign key in SQL?

  • A column (or group of columns) that establishes a link between data in two tables.
  • Enforces referential integrity by ensuring a value in one table matches a value in another.

What is the difference between INNER JOIN and OUTER JOIN?

  • INNER JOIN: Returns rows with matching values in both tables.
  • OUTER JOIN:
    • LEFT OUTER JOIN: Includes all rows from the left table, with NULL for unmatched rows in the right table.
    • RIGHT OUTER JOIN: Includes all rows from the right table, with NULL for unmatched rows in the left table.
    • FULL OUTER JOIN: Includes all rows from both tables, with NULL for unmatched rows.

Big O Notation

  • ( f(x) = O(g(x)) ) if there exist constants ( M > 0 ) and ( n > 0 ) such that: [ |f(x)| \leq M \cdot |g(x)| \text{ for all } x > n. ]

Linked List vs. Array

| Operation | Linked List | Array | |——————————|—————–|——————–| | Access by Index | ( O(n) ) | ( O(1) ) | | Search (Sorted) | ( O(n) ) | ( O(\log n) ) | | Insertion/Deletion at Head | ( O(1) ) | ( O(n) ) (may require resizing) | | Insertion/Deletion at Index | ( O(n) ) | ( O(n) ) |


C++ Concepts

Key Features

  1. Speed and Low Latency: Ideal for real-time risk analysis and pricing.
  2. Memory Allocation:
    • malloc() and free() for heap memory (manual management).
    • Classes typically use static memory (automatically managed).

Multithreading:

  • Unlike Python, C++ does not have a Global Interpreter Lock (GIL), enabling true multithreading.

Namespaces:

  • Avoid name collisions by organizing code logically.

Object-Oriented Programming (OOP):

  • Inheritance: Base and derived classes.
  • Polymorphism: Runtime behavior via virtual functions.
  • Encapsulation: Hiding data within classes.
  • Abstraction: Simplifying complex systems.

Structs:

  • Groups different data types into a single entity.

Floating-Point Types:

  • float: 32-bit precision.
  • double: 64-bit precision.

Virtual Functions:

  • Enable runtime polymorphism and allow overriding parent methods.

Pure virtual function

  • a virtual function that is declared within a base class but has no implementation in that class

pure virtual vs virtual:

Aspect Pure Virtual Virtual
Definition Declared with = 0, no implementation in the base class. Can have an implementation in the base class.
Syntax virtual void func() = 0; virtual void func() { ... };

Aspect Pure Virtual Virtual
Primary Purpose Defines an interface in the base class. Provides a default behavior in the base class.
Requirement Derived classes must override the function unless the derived class is also abstract. Derived classes may override the function, but it is not required.
Base Class Usage Makes the base class abstract (non-instantiable). The base class can still be instantiated.
Polymorphism Ensures all derived classes implement the same interface. Allows derived classes to optionally customize behavior.

hiding in C++

Derived class also define function in same name as base class (just like polymorphism with virtual).

but provide a new definition in the derived class that is unrelated to the base class member.

The function in base class is just void, not virtual.

Hiding uses compile-time binding, virtual functions uses runtime binding.

memory leak, memory overhead, memory overflow, memory corruption

  • Memory leak: Program fails to release memory, should free memory after use
  • Memory overhead: Program requiring extra memory (e.g. dynamic memory allocation)
  • Memory overflow: Program uses more memory than available
    (e.g. stack overflow, call stack reaches max recursion depth)
  • Memory corruption: Program modifies memory it does not own
    Can lead to crashes and data loss

Python

  1. How to handle exceptions?
    Use try, except, else, and finally blocks to catch and handle exceptions.

  2. Difference between deepcopy and shallowcopy?
    • Deepcopy: Copies all objects recursively.
    • Shallowcopy: Only copies the reference for nested objects.
  3. Pandas
    • How is a Pandas DataFrame different from an SQL table?

Both represent tabular data, but a Pandas DataFrame is an in-memory object for data manipulation in Python, while an SQL table is part of a relational database.

  • How to detect missing data

isnull()

  • Difference between loc and iloc?

loc uses column/row labels
iloc uses integet indices

  • concat vs merge vs join

concat(): Stacks DataFrames along rows or columns.
merge(): Combines DataFrames based on common columns or indices, similar to SQL joins.
join(): Combines DataFrames based on their indices.

  1. Multithreading vs Multiprocessing

Excel

  1. Relative, absolute, and mixed references?
    • Relative: Adjusts as you copy (A1).
    • Absolute: Fixed reference ($A$1).
    • Mixed: Fixes row or column ($A1 or A$1).
  2. Conditional formatting?
    Apply rules to highlight cells based on criteria, like highlighting values > 100.

  3. Common functions?
    • VLOOKUP/HLOOKUP: Find values in rows/columns.
    • INDEX-MATCH: Dynamic lookups.
    • IF: Conditional logic.
    • SUMIF: Sum based on a condition.
  4. Pivot tables?
    Summarize, group, and analyze data dynamically by dragging fields into rows, columns, and values.

  5. Data validation?
    Restricts user input in cells, e.g., dropdown lists or numeric limits.

VBA

  1. What is VBA?
    A programming language in Excel used for automation, macros, and creating custom functions.

  2. Key VBA objects?
    • Workbook: The Excel file.
    • Worksheet: A sheet in the workbook.
    • Range: A cell or group of cells.
  3. How to debug VBA code?
    Use breakpoints, the Immediate Window, and the Debug.Print method.

  4. How to run a macro?
    Use the Developer tab > Macros or assign it to a button.

  5. VBA for data analysis?
    Automates repetitive tasks like importing data, formatting, and creating charts.

SQL

  1. Primary key vs unique key?
    • Primary Key: Ensures uniqueness and no NULL values.
    • Unique Key: Ensures uniqueness but allows one NULL value.

e.g. For a table of students, StudentID can be primary key (must-have, unique, cannot be null)
Student_name can be unique key also unique but can be null

  1. What is indexing?
    Improves query speed by creating a data structure for fast lookups on columns.

  2. What is normalization?
    Organizing data to reduce redundancy and ensure integrity by dividing tables logically.

  3. What is a foreign key?
    Links data between tables, enforcing that values in one table exist in another.
    e.g. In another table called Course_Enrolment, can have StudentID columns referencing students taking this course

  4. INNER JOIN vs OUTER JOIN?

    • INNER JOIN: Returns only matching rows.
    • OUTER JOIN: Includes unmatched rows with NULLs.

R

  1. What is R used for?
    Statistical computing, data visualization, and machine learning.

  2. What are R’s data structures?
    • Vectors
    • Matrices
    • Data Frames
    • Lists
  3. How to install packages in R?
    Use install.packages("package_name") and library(package_name) to load.

  4. What is ggplot2?
    A popular R library for creating customizable visualizations.

  5. How to handle missing data?
    Use na.omit() to remove or impute() to fill missing values.