Skip to content

[feature] data synchronization function , through datax#1881

Merged
qiaozhanwei merged 17 commits intoapache:devfrom
wen-hemin:dev-etl_node
Feb 6, 2020
Merged

[feature] data synchronization function , through datax#1881
qiaozhanwei merged 17 commits intoapache:devfrom
wen-hemin:dev-etl_node

Conversation

@wen-hemin
Copy link
Copy Markdown
Contributor

please see this issue #1856

@codecov-io
Copy link
Copy Markdown

codecov-io commented Feb 2, 2020

Codecov Report

❗ No coverage uploaded for pull request base (dev@ed5bc8c). Click here to learn what that means.
The diff coverage is 32%.

Impacted file tree graph

@@          Coverage Diff           @@
##             dev    #1881   +/-   ##
======================================
  Coverage       ?   24.11%           
======================================
  Files          ?      299           
  Lines          ?    14422           
  Branches       ?     2372           
======================================
  Hits           ?     3478           
  Misses         ?    10552           
  Partials       ?      392
Impacted Files Coverage Δ
.../dolphinscheduler/api/service/ExecutorService.java 31.87% <ø> (ø)
...lphinscheduler/server/worker/task/TaskManager.java 0% <ø> (ø)
...heduler/server/master/runner/MasterExecThread.java 13.3% <ø> (ø)
...inscheduler/common/task/datax/DataxParameters.java 0% <0%> (ø)
...he/dolphinscheduler/common/model/DateInterval.java 12.5% <0%> (ø)
...che/dolphinscheduler/api/utils/ZooKeeperState.java 0% <0%> (ø)
...ler/server/master/runner/MasterTaskExecThread.java 0% <0%> (ø)
...dolphinscheduler/common/queue/TaskQueueZkImpl.java 0% <0%> (ø)
...lphinscheduler/server/worker/task/sql/SqlTask.java 0% <0%> (ø)
...olphinscheduler/common/model/TaskNodeRelation.java 0% <0%> (ø)
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed5bc8c...ec234ee. Read the comment docs.

@davidzollo
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@qiaozhanwei qiaozhanwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1,datax or sqoop is data import or export, what's mean ETL task naming ?
2,suggest remove class author,because other not

@wen-hemin
Copy link
Copy Markdown
Contributor Author

1,datax or sqoop is data import or export, what's mean ETL task naming ?
2,suggest remove class author,because other not

1,The front end is named ETL, and the inside is implemented by Datax. It can be replaced by other implementations.
2,fixed

@wen-hemin
Copy link
Copy Markdown
Contributor Author

/**
* datasource id
*/
private int datasource;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datasource -> dataSource

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx, let me fix it

/**
* datatarget id
*/
private int datatarget;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datatarget -> dataTarget

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx, let me fix it


@Override
public boolean checkParameters() {
if (!(datasource != 0 && StringUtils.isNotEmpty(dsType) && StringUtils.isNotEmpty(sql))) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether these two if condition are considered to be combined into one?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx, let me fix it

@Eights-Li
Copy link
Copy Markdown
Contributor

1,datax or sqoop is data import or export, what's mean ETL task naming ?
2,suggest remove class author,because other not

1,The front end is named ETL, and the inside is implemented by Datax. It can be replaced by other implementations.
2,fixed

data sync as this task name is more suitable than etl?
Datax or sqoop should only be responsible for the import and export of data. Although it can take part in data process, the main data process should be implemented by hive-ql or spark.

@Jave-Chen
Copy link
Copy Markdown
Contributor

Jave-Chen commented Feb 3, 2020

New files UT coverage need add files at pom.xml at root. Like this:

 <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>${maven-surefire-plugin.version}</version>
    <configuration>
        <includes>
            <include>**/common/utils/*.java</include>
            <include>**/common/threadutils/*.java</include>
            <include>**/common/graph/*.java</include>
            <include>**/common/queue/*.java</include>
            <include>**/api/utils/CheckUtilsTest.java</include>
            <include>**/api/utils/FileUtilsTest.java</include>
            <include>**/api/utils/exportprocess/DataSourceParamTest.java</include>
            <include>**/api/utils/exportprocess/DependentParamTest.java</include>
            <include>**/api/enums/*.java</include>
            <include>**/api/service/AccessTokenServiceTest.java</include>
            <include>**/api/service/QueueServiceTest.java</include>
            <include>**/api/service/MonitorServiceTest.java</include>
            <include>**/api/service/SessionServiceTest.java</include>
            <include>**/api/service/UsersServiceTest.java</include>
            <include>**/api/service/TenantServiceTest.java</include>
            <include>**/api/service/WorkerGroupServiceTest.java</include>
            <include>**/api/service/AlertGroupServiceTest.java</include>
            <include>**/api/service/ProjectServiceTest.java</include>
            <include>**/api/service/ProcessDefinitionServiceTest.java</include>
            <include>**/api/service/UdfFuncServiceTest.java</include>
            <include>**/api/service/ResourcesServiceTest.java</include>
            <include>**/api/service/ExecutorService2Test.java</include>
            <include>**/api/service/BaseServiceTest.java</include>
            <include>**/api/service/BaseDAGServiceTest.java</include>
            <include>**/alert/utils/ExcelUtilsTest.java</include>
            <include>**/alert/utils/FuncUtilsTest.java</include>
            <include>**/alert/utils/JSONUtilsTest.java</include>
            <include>**/alert/utils/PropertyUtilsTest.java</include>
            <include>**/server/utils/SparkArgsUtilsTest.java</include>
            <include>**/server/utils/FlinkArgsUtilsTest.java</include>
            <include>**/server/utils/ParamUtilsTest.java</include>
            <include>**/server/utils/ScheduleUtilsTest.java</include>
            <include>**/server/master/MasterExecThreadTest.java</include>
            <include>**/dao/mapper/AccessTokenMapperTest.java</include>
            <include>**/dao/mapper/AlertGroupMapperTest.java</include>
            <include>**/dao/mapper/AlertMapperTest.java</include>
            <include>**/dao/mapper/CommandMapperTest.java</include>
            <include>**/alert/template/AlertTemplateFactoryTest.java</include>
            <include>**/alert/template/impl/DefaultHTMLTemplateTest.java</include>
            <!-- add here -->
            <include>**/server/worker/task/AbstractTask.java</include>
        </includes>
        <!-- <skip>true</skip> -->
    </configuration>
</plugin>

@wen-hemin
Copy link
Copy Markdown
Contributor Author

1,datax or sqoop is data import or export, what's mean ETL task naming ?
2,suggest remove class author,because other not

1,The front end is named ETL, and the inside is implemented by Datax. It can be replaced by other implementations.
2,fixed

data sync as this task name is more suitable than etl?
Datax or sqoop should only be responsible for the import and export of data. Although it can take part in data process, the main data process should be implemented by hive-ql or spark.

I think,A better name is DataX

String targetColumn = "`select`";

assertEquals(DataxUtils.doConvertKeywordsColumn(DbType.MYSQL, fromColumn), targetColumn);
assertEquals(DataxUtils.doConvertKeywordsColumn(DbType.MYSQL, " \"`select`\" "), "`select`");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEquals first param is expected value, sceond param is actual value, please swap there params.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx, I fixed


assertEquals(DataxUtils.doConvertKeywordsColumn(DbType.MYSQL, fromColumn), targetColumn);
assertEquals(DataxUtils.doConvertKeywordsColumn(DbType.MYSQL, " \"`select`\" "), "`select`");
assertEquals(DataxUtils.doConvertKeywordsColumn(DbType.POSTGRESQL, " \"`select`\" "), "\"select\"");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEquals first param is expected value, sceond param is actual value, please swap there params, from 102-105 lines.


Assert.assertTrue(columns.length == 2);

Assert.assertEquals(Arrays.toString(columns), "[`a`, `table`]");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEquals first param is expected value, sceond param is actual value, please swap there params.


Assert.assertTrue(columns.length == 2);

Assert.assertEquals(Arrays.toString(columns), "[a, b]");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEquals first param is expected value, sceond param is actual value, please swap there params.

Assert.assertNotNull(reader);

String readerPluginName = (String) reader.get("name");
Assert.assertEquals(readerPluginName, DataxUtils.DATAX_READER_PLUGIN_MYSQL);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEquals first param is expected value, sceond param is actual value, please swap there params.

Assert.assertNotNull(writer);

String writerPluginName = (String) writer.get("name");
Assert.assertEquals(writerPluginName, DataxUtils.DATAX_WRITER_PLUGIN_MYSQL);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEquals first param is expected value, sceond param is actual value, please swap there params.

public void testNotNull()
throws Exception {
try {
Method method = DataxTask.class.getDeclaredMethod("notNull", Object.class, String.class);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this Test no assert?

@davidzollo
Copy link
Copy Markdown
Contributor

Come on

@wen-hemin
Copy link
Copy Markdown
Contributor Author

UT bug is fixed

Copy link
Copy Markdown
Contributor

@Eights-Li Eights-Li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@Eights-Li Eights-Li requested a review from qiaozhanwei February 5, 2020 01:38
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud bot commented Feb 5, 2020

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities (and Security Hotspot 0 Security Hotspots to review)
Code Smell A 18 Code Smells

69.3% 69.3% Coverage
0.2% 0.2% Duplication

Copy link
Copy Markdown
Contributor

@qiaozhanwei qiaozhanwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@qiaozhanwei qiaozhanwei merged commit f942e5c into apache:dev Feb 6, 2020
@wen-hemin wen-hemin deleted the dev-etl_node branch February 6, 2020 01:35
@davidzollo davidzollo changed the title [feature] data synchronization function [feature] data synchronization function , through datax Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants