Joins – Part 3

Today we will see how to work with joins in Apache Hive. In my previous post “Joins – Part 1” we have seen What are joins?
 
1. Inner Join

-- inner join
SELECT person.empid, person.name, person.birthdate, person.birthcountry, person.phone, person.emailaddress,
       employees.gender, employees.maritalstatus, employees.jobtitle, employees.annualrate, employees.startdate,
       employees.terminationdate
FROM   person person JOIN employees employees
         ON (person.empid = employees.empid);

Inner join output resultset —

KC0001  Martina Griffiths       1956-03-31      CAN     9256947901      HCMGENUser1@ap6023fems.us.oracle.com    F       M       Secretary-Senior        31500.0 1990-01-01      NULL
KC0003  Cynthia Rogers  1950-07-11              9256947903      HCMGENUser1@ap6023fems.us.oracle.com    F       S       Clerk-Accounting        23010.0 1988-03-12      NULL
KC0015  Barry Campbell  1950-12-31      CAN     9256947915      HCMGENUser1@ap6023fems.us.oracle.com    M       S       Specialist-Training     43200.0 1984-11-02      2000-01-28
KC0016  Paulette Vaillancourt   1929-12-10              9256947916      HCMGENUser1@ap6023fems.us.oracle.com    F       M       Analyst-Human Resources 52320.0 1981-03-20      1999-12-31
KU0055  Netty Owyang    1958-08-02              9255556547      netty.owyang@xyzcompany.com     F       S       Manager-Finance 60008.0 1988-03-21      NULL
KU0059  Vicki Zinn      1958-04-16              9256947959      vicki.zinn@co.com       F       M       Analyst-Financial       48006.4 1988-07-16      NULL
KU0067  Alex Passantino 1967-10-01      USA     9256947967      HCMGENUser1@ap6023fems.us.oracle.com    M       M       Vice President-Intl Ops 182200.0        1991-02-10      NULL
KWG126  Johann Moser    1950-04-15      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    M       M       Cleaner 20000.0 1965-03-01      2005-12-31
KWG127  Anita Zahnd     1966-05-23      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    F       S       Researcher      110000.0        1985-02-01      2005-12-31
KWG128  Susette Racine  1946-08-26      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    F       M       Other Job2      20000.0 1960-04-01      2005-12-31

 
2. Full Outer Join

-- full outer join
SELECT IF(employees.empid is NULL, person.empid, employees.empid), person.name, person.birthdate,
       person.birthcountry, person.phone, person.emailaddress, employees.gender, employees.maritalstatus,
       employees.jobtitle, employees.annualrate, employees.startdate, employees.terminationdate
FROM   person person FULL JOIN employees employees
         ON (person.empid = employees.empid);

Full outer join output resultset —

KC0001  Martina Griffiths       1956-03-31      CAN     9256947901      HCMGENUser1@ap6023fems.us.oracle.com    F       M       Secretary-Senior        31500.0 1990-01-01      NULL
KC0003  Cynthia Rogers  1950-07-11              9256947903      HCMGENUser1@ap6023fems.us.oracle.com    F       S       Clerk-Accounting        23010.0 1988-03-12      NULL
KC0015  Barry Campbell  1950-12-31      CAN     9256947915      HCMGENUser1@ap6023fems.us.oracle.com    M       S       Specialist-Training     43200.0 1984-11-02      2000-01-28
KU0056  NULL    NULL    NULL    NULL    NULL    M       S       Consultant-Junior       48006.4 1988-04-03      NULL
KU0057  NULL    NULL    NULL    NULL    NULL    F       M       Manager-Project 60008.0 1988-05-17      2005-12-31
KU0058  NULL    NULL    NULL    NULL    NULL    F       M       Specialist-Pension      26000.0 1988-06-11      NULL
KWG110  Heinz Ganz      1986-02-28      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    NULL    NULL    NULL    NULL    NULL    NULL
KWG111  Monica Herz     1966-06-30      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    NULL    NULL    NULL    NULL    NULL    NULL
KWG112  Rosa Inglese    1979-10-15      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    NULL    NULL    NULL    NULL    NULL    NULL

 
3. Left Outer Join

-- left outer join
SELECT IF(employees.empid is NULL, person.empid, employees.empid), person.name, person.birthdate,
       person.birthcountry, person.phone, person.emailaddress, employees.gender, employees.maritalstatus,
       employees.jobtitle, employees.annualrate, employees.startdate, employees.terminationdate
FROM   person person LEFT JOIN employees employees
         ON (person.empid = employees.empid);

Left outer join output resultset —

KC0001  Martina Griffiths       1956-03-31      CAN     9256947901      HCMGENUser1@ap6023fems.us.oracle.com    F       M       Secretary-Senior        31500.0 1990-01-01      NULL
KC0003  Cynthia Rogers  1950-07-11              9256947903      HCMGENUser1@ap6023fems.us.oracle.com    F       S       Clerk-Accounting        23010.0 1988-03-12      NULL
KC0015  Barry Campbell  1950-12-31      CAN     9256947915      HCMGENUser1@ap6023fems.us.oracle.com    M       S       Specialist-Training     43200.0 1984-11-02      2000-01-28
KWG110  Heinz Ganz      1986-02-28      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    NULL    NULL    NULL    NULL    NULL    NULL
KWG111  Monica Herz     1966-06-30      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    NULL    NULL    NULL    NULL    NULL    NULL
KWG112  Rosa Inglese    1979-10-15      CHE     49891430        HCMGENUser1@ap6023fems.us.oracle.com    NULL    NULL    NULL    NULL    NULL    NULL

 
4. Right Outer Join

-- left outer join
SELECT IF(employees.empid is NULL, person.empid, employees.empid), person.name, person.birthdate,
       person.birthcountry, person.phone, person.emailaddress, employees.gender, employees.maritalstatus,
       employees.jobtitle, employees.annualrate, employees.startdate, employees.terminationdate
FROM   person person RIGHT JOIN employees employees
         ON (person.empid = employees.empid);

Right outer join output resultset —

KC0001  Martina Griffiths       1956-03-31      CAN     9256947901      HCMGENUser1@ap6023fems.us.oracle.com    F       M       Secretary-Senior        31500.0 1990-01-01      NULL
KC0003  Cynthia Rogers  1950-07-11              9256947903      HCMGENUser1@ap6023fems.us.oracle.com    F       S       Clerk-Accounting        23010.0 1988-03-12      NULL
KC0015  Barry Campbell  1950-12-31      CAN     9256947915      HCMGENUser1@ap6023fems.us.oracle.com    M       S       Specialist-Training     43200.0 1984-11-02      2000-01-28
KU0056  NULL    NULL    NULL    NULL    NULL    M       S       Consultant-Junior       48006.4 1988-04-03      NULL
KU0057  NULL    NULL    NULL    NULL    NULL    F       M       Manager-Project 60008.0 1988-05-17      2005-12-31
KU0058  NULL    NULL    NULL    NULL    NULL    F       M       Specialist-Pension      26000.0 1988-06-11      NULL

 
For more on Hive commands, functions, and UDFs follow Qubole Hive Cheatsheet.