HTMLテーブルを変換する方法は複数あります into CSVとExcelスプレッドシートを使用して GrabzItのPython APIここで詳しく説明するのは、最も便利なテクニックの一部です。 ただし、開始する前に、 URLToTable, HTMLToTable or FileToTable メソッド Save or SaveTo メソッドを呼び出してテーブルをキャプチャする必要があります。 このサービスが適切かどうかをすばやく確認したい場合は、 HTMLテーブルのキャプチャのライブデモ URLから。
以下のコードスニペットは、指定されたWebページの最初のHTMLテーブルを自動的に変換します intダウンロードまたは解析できるCSVドキュメント。
grabzIt.URLToTable("https://www.tesla.com") # Then call the Save or SaveTo method
grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr> <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr> </table></body></html>") # Then call the Save or SaveTo method
grabzIt.FileToTable("tables.html") # Then call the Save or SaveTo method
デフォルトでは、これは、識別する最初のテーブルを変換します intテーブル。 ただし、Webページの2番目のテーブルは、2を tableNumberToInclude
属性。
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.tableNumberToInclude = 2 grabzIt.URLToTable("https://www.tesla.com", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.tableNumberToInclude = 2 grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr> <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr> </table></body></html>", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.tableNumberToInclude = 2 grabzIt.FileToTable("tables.html", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.csv")
また指定することができます targetElement
指定された要素ID内のテーブルのみが変換されることを保証する属性。
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.targetElement = "stocks_table" grabzIt.URLToTable("https://www.tesla.com", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.targetElement = "stocks_table" grabzIt.HTMLToTable("<html><body><table id='stocks_table'><tr><th>Name</th><th>Age</th></tr> <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr> </table></body></html>", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.targetElement = "stocks_table" grabzIt.FileToTable("tables.html", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.csv")
または、trueを渡すことにより、Webページ上のすべてのテーブルをキャプチャできます。 includeAllTables
ただし、これはXLSXおよびJSON形式でのみ機能します。 このオプションは、生成されたスプレッドシートブック内の新しいシートに各テーブルを配置します。
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.format = 'xlsx' options.includeAllTables = True grabzIt.URLToTable("https://www.tesla.com", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.xlsx")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.format = 'xlsx' options.includeAllTables = True grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr> <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr> </table></body></html>", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.xlsx")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.format = 'xlsx' options.includeAllTables = True grabzIt.FileToTable("tables.html", options) # Then call the Save or SaveTo method grabzIt.SaveTo("result.xlsx")
PythonとGrabzItのHTMLテーブル変換サービスを使用すると、HTMLテーブルを変換できます into JSON。 以下に示す最初のステップは、指定することです json
formatパラメーターで。 次に、JSONを取得します string 同期して SaveTo
メソッドを使用すると、Pythonのお気に入りのJSONパーサーを使用してJSONを変換できます string intオブジェクト。
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.format = "json" options.tableNumberToInclude = 1 grabzIt.URLToTable("https://www.tesla.com", options) json = grabzIt.SaveTo()
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.format = "json" options.tableNumberToInclude = 1 grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr> <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr> </table></body></html>", options) json = grabzIt.SaveTo()
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.format = "json" options.tableNumberToInclude = 1 grabzIt.FileToTable("tables.html", options) json = grabzIt.SaveTo()
にカスタム識別子を渡すことができます テーブル メソッドを以下に示すように、この値はGrabzIt Pythonハンドラーに返されます。 たとえば、このカスタム識別子はデータベース識別子であり、スクリーンショットを特定のデータベースレコードに関連付けることができます。
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.customId = "123456" grabzIt.URLToTable("https://www.tesla.com", options) # Then call the Save method grabzIt.Save("http://www.example.com/handler.py")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.customId = "123456" grabzIt.HTMLToTable("<html><body><h1>Hello World!</h1></body></html>", options) # Then call the Save method grabzIt.Save("http://www.example.com/handler.py")
from GrabzIt import GrabzItTableOptions from GrabzIt import GrabzItClient grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret") options = GrabzItTableOptions.GrabzItTableOptions() options.customId = "123456" grabzIt.FileToTable("example.html", options) # Then call the Save method grabzIt.Save("http://www.example.com/handler.py")