AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load your data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You point AWS Glue at data stored on AWS, and AWS Glue discovers the data and stores the associated metadata, such as table definitions and schemas, in the AWS Glue Data Catalog. Once cataloged, the data is immediately available for search, query, and ETL. AWS Glue also generates the code that runs your data transformations and data loading processes.
The code AWS Glue generates is customizable, reusable, and portable. Once an ETL job is ready, you can schedule it to run on AWS Glue's fully managed, scale-out Apache Spark environment. AWS Glue provides a flexible scheduler with dependency resolution, job monitoring, and alerting.
AWS Glue is serverless, so there is no infrastructure to buy, set up, or manage. The environment needed to run a job is provisioned automatically, and you pay only for the compute resources used while your ETL jobs run. Data can be ready for analytics within minutes.
Easy
AWS Glue automates most of the time-consuming work of building, managing, and running ETL jobs. It crawls your data sources automatically, identifies data formats, and suggests schemas and transformations. It also automatically generates the code that runs your data transformations and loading processes.
Integrated
AWS Glue is integrated with a wide range of AWS services. It natively supports data stored in Amazon Aurora, Amazon RDS for MySQL, Amazon RDS for Oracle, Amazon RDS for PostgreSQL, Amazon RDS for SQL Server, Amazon Redshift, and Amazon S3, as well as MySQL, Oracle, Microsoft SQL Server, and PostgreSQL databases running on Amazon EC2 in your Virtual Private Cloud (Amazon VPC). Out of the box, AWS Glue integrates with Amazon Athena, Amazon EMR, Amazon Redshift Spectrum, and any Apache Hive Metastore-compatible application.
Serverless
AWS Glue is serverless, so you do not need to provision or manage infrastructure. AWS Glue handles provisioning, configuration, and scaling of the resources required to run your ETL jobs, which execute on a fully managed, scale-out Apache Spark environment. You pay only for the resources used while your jobs are running.
Developer friendly
AWS Glue generates ETL code that is customizable, reusable, and portable, using technologies developers already know: Scala, Python, and Apache Spark. You can also import custom readers, writers, and transformations into your Glue ETL code. Because the code AWS Glue generates is based on open frameworks, there is no lock-in, and you can use it anywhere.
First, register your data sources using the AWS Management Console. AWS Glue crawls the data sources and builds the Data Catalog using pre-built classifiers for many common source formats and data types, such as JSON, CSV, and Parquet.
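The same registration step can also be scripted rather than clicked through. The sketch below is a minimal illustration and not part of the original walkthrough: it assumes `glue` is a boto3 Glue client (for example `boto3.client("glue")`), and the crawler name, IAM role ARN, database name, and S3 path are all placeholders supplied by the caller.

```python
def register_s3_source(glue, crawler_name, role_arn, database, s3_path):
    """Create a crawler over one S3 path and start it.

    Assumption: `glue` behaves like a boto3 Glue client.
    Every name passed in is a caller-chosen placeholder.
    """
    # Define the crawler: which data to scan and which catalog
    # database should receive the discovered table definitions.
    glue.create_crawler(
        Name=crawler_name,
        Role=role_arn,
        DatabaseName=database,
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    # Kick off the crawl; classification and cataloging happen asynchronously.
    glue.start_crawler(Name=crawler_name)
    return crawler_name
```

Passing the client in as an argument keeps the function easy to exercise with a stub before pointing it at a real account.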
Next, select a data source and a data target. AWS Glue generates ETL code in Scala or Python that extracts data from the source, transforms it to match the target schema, and loads it into the target. You can edit, debug, and test the generated code in the console, in any IDE, or in a text editor.
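To make the extract, transform, and load shape concrete, here is a plain-Python illustration. It is not the code AWS Glue generates (generated scripts run on Apache Spark); the field names and the `MAPPINGS` table are hypothetical and exist only to show how source fields can be renamed and cast to fit a target schema.

```python
# Each entry: (source_field, target_field, cast_function). Hypothetical schema.
MAPPINGS = [
    ("id", "customer_id", int),
    ("name", "full_name", str),
    ("spend", "total_spend", float),
]

def apply_mapping(record, mappings):
    """Rename and cast fields so the record matches the target schema.

    Source fields missing from the record are skipped;
    fields not listed in the mappings are dropped.
    """
    out = {}
    for source, target, cast in mappings:
        if source in record:
            out[target] = cast(record[source])
    return out

def run_etl(source_rows, mappings):
    """Extract (iterate rows), transform (apply_mapping), load (collect a list)."""
    return [apply_mapping(row, mappings) for row in source_rows]
```

In a real job the "load" step would write to the chosen target instead of returning a list.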
AWS Glue makes it easy to schedule recurring ETL jobs, chain multiple jobs together, and invoke jobs on demand from other services such as AWS Lambda. AWS Glue manages the dependencies between jobs, automatically scales the underlying resources, and automatically retries failed jobs.
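Scheduling and on-demand invocation can be sketched as two small helpers. This is an illustration under assumptions: `glue` is taken to be a boto3 Glue client, and the trigger name, job name, and cron expression are placeholders. The schedule uses the `cron(...)` wrapper form that AWS Glue time-based triggers expect.

```python
def schedule_job(glue, trigger_name, job_name, cron_expression):
    """Create a time-based trigger that runs `job_name` on a schedule.

    Assumption: `glue` behaves like a boto3 Glue client.
    `cron_expression` is the six-field body, e.g. "0 12 * * ? *".
    """
    glue.create_trigger(
        Name=trigger_name,
        Type="SCHEDULED",
        Schedule=f"cron({cron_expression})",
        Actions=[{"JobName": job_name}],
        StartOnCreation=True,
    )

def run_job_now(glue, job_name, arguments=None):
    """Start a single on-demand run and return its run ID."""
    kwargs = {"JobName": job_name}
    if arguments:
        # Job arguments are optional string-to-string pairs.
        kwargs["Arguments"] = arguments
    response = glue.start_job_run(**kwargs)
    return response["JobRunId"]
```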
For more information, visit the AWS Glue product details page or see the AWS product documentation.
With AWS Glue, you can clean, normalize, and enrich your data sets to prepare clickstream data or process log data for analytics. AWS Glue generates schemas for semi-structured data, creates ETL code to transform, flatten, and enrich the data, and loads your data warehouse on a recurring basis.
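As a rough picture of what "flatten" and "clean" mean for a semi-structured clickstream record, consider the plain-Python sketch below. It is illustrative only and does not use the AWS Glue libraries; the event fields and the dotted key convention are assumptions made for the example.

```python
def flatten(record, parent_key="", sep="."):
    """Turn nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the key path along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

def clean(record):
    """Drop null fields and trim whitespace from string values."""
    cleaned = {}
    for key, value in record.items():
        if value is None:
            continue
        cleaned[key] = value.strip() if isinstance(value, str) else value
    return cleaned
```

For example, `clean(flatten({"user": {"id": 1}, "page": " /home "}))` yields a flat record with the keys `user.id` and `page`, ready for a tabular target.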
With the AWS Glue Data Catalog, you can quickly discover and search data across multiple AWS data sets without moving it. Once cataloged, the data is immediately available for search and query using Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
Data lakes are an increasingly popular way to store and analyze both structured and unstructured data. If you use an Amazon S3 data lake, AWS Glue can make all of your data available for analytics right away, without moving it. Glue crawlers scan the data lake and keep the Glue Data Catalog in sync with the underlying data. You can then query the data lake directly from Amazon Athena and Amazon Redshift Spectrum. You can also use the Glue Data Catalog as an external Apache Hive Metastore for big data applications running on Amazon EMR.
AWS Glue can run ETL jobs in response to events, such as the arrival of a new data set. For example, you can use an AWS Lambda function to trigger an ETL job as soon as new data becomes available in Amazon S3. You can also register the new data set in the AWS Glue Data Catalog as part of the ETL job.
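The event-driven pattern described above might look like the following inside a Lambda function. Treat it as a sketch under assumptions: `glue` is taken to be a boto3 Glue client, the job name `example-etl-job` and the `--source_path` job argument are hypothetical, and the event is assumed to follow the standard Amazon S3 notification layout (`Records[0].s3.bucket.name` and `Records[0].s3.object.key`).

```python
import urllib.parse

JOB_NAME = "example-etl-job"  # hypothetical job name

def build_job_arguments(event):
    """Derive job arguments from the first record of an S3 notification."""
    s3 = event["Records"][0]["s3"]
    bucket = s3["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 notifications.
    key = urllib.parse.unquote_plus(s3["object"]["key"])
    # "--source_path" is an illustrative argument name, not a built-in one.
    return {"--source_path": f"s3://{bucket}/{key}"}

def handle_s3_event(event, glue, job_name=JOB_NAME):
    """Start one ETL job run for the newly arrived object; return the run ID."""
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(event),
    )
    return response["JobRunId"]
```

A Lambda entry point would then be a thin wrapper, roughly `def lambda_handler(event, context): return handle_s3_event(event, boto3.client("glue"))`, with the Lambda role granted permission to start the job.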
Getting started with AWS Glue is easy. Sign in to the AWS Management Console and click [AWS Glue] under the [Analytics] category.

