-- Run this in HIVE Editor --
SELECT startterminal, startstation, COUNT(1) AS count
FROM bay_area_bike_share
GROUP BY startterminal, startstation
ORDER BY count DESC LIMIT 10
-- Run this in HIVE Editor --
SELECT startstation, COUNT(1) AS count
FROM bay_area_bike_share
GROUP BY startstation
ORDER BY count ASC LIMIT 10
-- Run this in HIVE Editor --
SELECT endstation, COUNT(1) AS count
FROM bay_area_bike_share
GROUP BY endstation
ORDER BY count DESC LIMIT 10
Hi Vinay, hoping you can help! With regards to the Bike Sharing Assignment.... " Utilizing the Bay Area Bike Share database (both Year 1 & 2, Aug. 2013- Aug. 2015)- what is the most popular start station based on trip data?
ReplyDeleteCan you tell me how you joined both Year 1 & 2 to query at the same time? Your table is "bay_area_bike_share". Does this table contain both Year 1 and 2 data?"
Thanks!
Not sure if your still wondering about this but you should be able to do it with the command UNION....
DeleteSELECT t.startstation, COUNT(t.startstation) AS c
FROM (
SELECT startstation FROM 201402_tripdata
UNION ALL
SELECT startstation FROM 201408_tripdata
UNION ALL
SELECT startstation FROM 201508_tripdata
) t
GROUP BY t.startstation
ORDER BY c DESC
LIMIT 10
Q3 uses both Dataset 1 and 2
Q2 looks at only the first dataset
Q1 looks at all 3 data sets.
At least that is how I read it.