Joiners
Module with join transformers.
Joiners
Bases: object
Class containing join transformers.
Source code in mkdocs/lakehouse_engine/packages/transformers/joiners.py
join(join_with, join_condition, left_df_alias='a', right_df_alias='b', join_type='inner', broadcast_join=True, select_cols=None, watermarker=None)
classmethod
Join two dataframes based on specified type and columns.
Some stream-to-stream joins are only possible when a watermark is applied, so this method also accepts a parameter for specifying watermarking.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
left_df_alias | str | alias of the first dataframe. | 'a' |
join_with | DataFrame | right dataframe. | required |
right_df_alias | str | alias of the second dataframe. | 'b' |
join_condition | str | condition to join dataframes. | required |
join_type | str | type of join. Defaults to inner. Available values: inner, cross, outer, full, full outer, left, left outer, right, right outer, semi, left semi, anti, and left anti. | 'inner' |
broadcast_join | bool | whether to perform a broadcast join or not. | True |
select_cols | Optional[List[str]] | list of columns to select at the end. | None |
watermarker | Optional[dict] | properties to apply watermarking. | None |
Returns:
Type | Description |
---|---|
Callable | A function to be passed to the Spark DataFrame .transform() method. |
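The return value is a function rather than a dataframe, which is what allows it to be chained through `.transform()`. The sketch below illustrates that pattern only; it is not the library's implementation. `MiniFrame` and `inner_join_on` are hypothetical names, and `MiniFrame` is a plain-Python stand-in that mimics nothing of a Spark DataFrame beyond its `transform` method.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class MiniFrame:
    """Hypothetical stand-in for a Spark DataFrame; only mimics .transform()."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def transform(self, func: Callable[["MiniFrame"], "MiniFrame"]) -> "MiniFrame":
        # Like DataFrame.transform: hand this frame to func and return its result.
        return func(self)


def inner_join_on(join_with: MiniFrame, key: str) -> Callable[[MiniFrame], MiniFrame]:
    """Return a function that inner-joins its input with `join_with` on `key`."""

    def inner(df: MiniFrame) -> MiniFrame:
        # Keep only row pairs whose key values match, merging their columns.
        joined = [
            {**left, **right}
            for left in df.rows
            for right in join_with.rows
            if left[key] == right[key]
        ]
        return MiniFrame(joined)

    return inner


orders = MiniFrame([
    {"customer_id": 1, "amount": 10},
    {"customer_id": 3, "amount": 5},
])
customers = MiniFrame([
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Bo"},
])

# The factory is called first; the function it returns is what .transform() receives.
result = orders.transform(inner_join_on(customers, "customer_id"))
```

With the real class, the equivalent call would presumably take the form `df.transform(Joiners.join(join_with=other_df, join_condition="a.customer_id = b.customer_id"))`, assuming the join condition refers to columns through the default aliases `a` and `b`.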