AWS Glue Commands
1. Simple User migration
Rename the user ID from ghid to xyid
User:
ghid, name, email, gpa
User_new:
xyid, Name, Email, gpa_int
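The mapping from User to User_new can be sketched in plain Python before wiring it into a Glue job. This is only an illustration: the function name migrate_user and the sample record are hypothetical, and the rule for gpa_int (round to the nearest integer) is an assumption, since the notes do not say how gpa becomes gpa_int.

```python
def migrate_user(row: dict) -> dict:
    """Map one User record to the User_new layout."""
    return {
        "xyid": row["ghid"],      # ghid is renamed to xyid
        "Name": row["name"],
        "Email": row["email"],
        # Assumption: gpa_int is gpa rounded to the nearest integer.
        "gpa_int": round(float(row["gpa"])),
    }


# Hypothetical sample record, only for illustration.
users = [{"ghid": "u1", "name": "Ann", "email": "ann@example.com", "gpa": "3.7"}]
migrated = [migrate_user(u) for u in users]
```

If truncation is wanted instead of rounding, int(float(row["gpa"])) would replace the round() call.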
S3 Bucket:
cxh-migration-report
Migration-Glue
def MyTransform (glueContext, dfc) -> DynamicFrameCollection:
Custom Job:
https://aws-dojo.com/ws23/labs/create-job/
https://aws-glue-intro.workshop.aws/intro.html
https://aws-glue-intro.workshop.aws/lab6/custom-transformation.html
1. Add columns with product details to a CSV file stored in S3. Apply this only to specific products.
If product_type is ABC, call get_product_details_abc();
otherwise, call get_product_details_xyz()
2. Do the same as in 1, but call a third-party API to get the details
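A minimal sketch of the branching in item 1, written as plain Python over CSV text so it can be tried locally. The bodies of get_product_details_abc() and get_product_details_xyz() are placeholders, and the columns product_id and details_source are assumptions made for the example; only the product_type check comes from the notes.

```python
import csv
import io


def get_product_details_abc(row):
    # Placeholder body: the real lookup is not described in the notes.
    return {"details_source": "abc"}


def get_product_details_xyz(row):
    # Placeholder body: the real lookup is not described in the notes.
    return {"details_source": "xyz"}


def enrich_row(row):
    """Return a copy of the row with the extra product-detail columns."""
    if row["product_type"] == "ABC":
        details = get_product_details_abc(row)
    else:
        details = get_product_details_xyz(row)
    return {**row, **details}


def enrich_csv(text):
    """Parse CSV text and enrich every row."""
    reader = csv.DictReader(io.StringIO(text))
    return [enrich_row(r) for r in reader]


# Hypothetical input, standing in for the CSV file in S3.
sample = "product_id,product_type\n1,ABC\n2,XYZ\n"
enriched = enrich_csv(sample)
```

The same enrich_row logic could probably be applied per row inside a Glue custom transform, but that wiring is not shown here.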
https://aws-dojo.com/ws23/labs/
RISK 3:
Create a new column named
Call a REST API to get the value of the new column
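One way to fill a new column from a REST API is sketched below. Everything specific in it is an assumption: the endpoint URL is made up, the lookup key product_id is hypothetical, and the column name is passed in as a parameter because the notes leave it blank. The fetch function is injectable so the logic can be exercised without network access.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the real third-party API.
API_URL = "https://api.example.com/products/{product_id}"


def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def add_api_column(rows, column, fetch=fetch_json):
    """Add `column` to each row, calling the API once per distinct product_id."""
    cache = {}
    out = []
    for row in rows:
        key = row["product_id"]
        if key not in cache:
            cache[key] = fetch(API_URL.format(product_id=key))
        out.append({**row, column: cache[key].get(column)})
    return out


# Usage with a stub in place of the real API call.
calls = []


def fake_fetch(url):
    calls.append(url)
    return {"category": "demo"}


rows = add_api_column(
    [{"product_id": "1"}, {"product_id": "1"}], "category", fetch=fake_fetch
)
```

Caching by key keeps the number of API calls down when many rows share the same product, which matters when the job processes a large file.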
https://gist.github.com/rajasgs/5742d38e8ee1a4dc423bf7d9ea19bded
https://aws-glue-intro.workshop.aws/prerequisites/s3-and-local-file.html
BUCKET_NAME=migrationglue-kde
aws s3 mb s3://${BUCKET_NAME}
aws s3api put-public-access-block --bucket ${BUCKET_NAME} \
--public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
echo ${BUCKET_NAME}
cd ~/environment
wget https://aws-glue-intro.workshop.aws/download/glue-workshop.zip
unzip glue-workshop.zip
mkdir ~/environment/glue-workshop/library
mkdir ~/environment/glue-workshop/output
git clone https://github.com/jefftune/pycountry-convert.git
cd ~/environment/pycountry-convert
zip -r pycountry_convert.zip pycountry_convert/
mv ~/environment/pycountry-convert/pycountry_convert.zip ~/environment/glue-workshop/library/
cd ~/environment/glue-workshop
aws s3 cp --recursive ~/environment/glue-workshop/code/ s3://${BUCKET_NAME}/script/
aws s3 cp --recursive ~/environment/glue-workshop/data/ s3://${BUCKET_NAME}/input/
aws s3 cp --recursive ~/environment/glue-workshop/library/ s3://${BUCKET_NAME}/library/
aws s3 cp --recursive s3://covid19-lake/rearc-covid-19-testing-data/json/states_daily/ s3://${BUCKET_NAME}/input/lab5/json/
aws cloudformation create-stack --stack-name migrationglue \
--template-body file://~/environment/glue-workshop/cloudformation/NoVPC.yaml \
--capabilities CAPABILITY_NAMED_IAM \
--region us-east-2 \
--parameters \
ParameterKey=UniquePostfix,ParameterValue=glueworkshop \
ParameterKey=S3Bucket,ParameterValue=s3://${BUCKET_NAME}/
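The same stack creation can be sketched with boto3, should a script be preferred over the CLI. This assumes boto3 is installed and AWS credentials are configured; the helper names build_stack_parameters and create_stack are made up for this example, while the stack name, region, capability and parameter keys are taken from the command above.

```python
from pathlib import Path


def build_stack_parameters(bucket_name, postfix="glueworkshop"):
    """Build the Parameters list in the shape boto3 expects."""
    return [
        {"ParameterKey": "UniquePostfix", "ParameterValue": postfix},
        {"ParameterKey": "S3Bucket", "ParameterValue": f"s3://{bucket_name}/"},
    ]


def create_stack(bucket_name, template_path, region="us-east-2"):
    """Create the migrationglue stack and wait until creation completes."""
    import boto3  # imported lazily so the helper above works without boto3

    cfn = boto3.client("cloudformation", region_name=region)
    cfn.create_stack(
        StackName="migrationglue",
        TemplateBody=Path(template_path).expanduser().read_text(),
        Capabilities=["CAPABILITY_NAMED_IAM"],
        Parameters=build_stack_parameters(bucket_name),
    )
    # Block until CloudFormation reports CREATE_COMPLETE (raises on failure).
    cfn.get_waiter("stack_create_complete").wait(StackName="migrationglue")
```

A call would look like create_stack("migrationglue-kde", "~/environment/glue-workshop/cloudformation/NoVPC.yaml").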
https://docs.aws.amazon.com/cli/latest/reference/s3api/index.html
https://aws-glue-intro.workshop.aws/lab1/create-crawler.html