<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question How to reduce RMSE (Root Mean Squared Error) value for linear regression in machine learning? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-reduce-RMSE-Root-Mean-Squred-Error-value-for-linear/m-p/133877#M43695</link>
    <description>&lt;P&gt;Hi Guys,&lt;/P&gt;&lt;P&gt;I am new to machine learning. I have a dataset of clinical trials that contains both textual and numerical data (I have converted all the textual features to numeric ones using the &lt;STRONG&gt;Divectorization&lt;/STRONG&gt; library of Python).&lt;/P&gt;&lt;P&gt;I have attached the dataset CSV file and the Jupyter Python notebook. Please take a look.&lt;/P&gt;&lt;P&gt;If you want a description of the dataset, please visit the link below. I have used the public data from the clinicaltrials.gov website.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;A href="https://clinicaltrials.gov/ct2/about-studies/glossary" target="_blank"&gt;https://clinicaltrials.gov/ct2/about-studies/glossary&lt;/A&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Problem Statement&lt;/STRONG&gt;: The dataset contains an "ENROLLMENT" column (the number of participants required for a clinical study), and I want my algorithm to predict "ENROLLMENT" based on the training data.&lt;/P&gt;&lt;P&gt;Before opening the files, please change the extension from &lt;STRONG&gt;.txt&lt;/STRONG&gt; to &lt;STRONG&gt;.csv&lt;/STRONG&gt; for &lt;STRONG&gt;ct_gov_results&lt;/STRONG&gt; and from &lt;STRONG&gt;.txt&lt;/STRONG&gt; to &lt;STRONG&gt;.ipynb&lt;/STRONG&gt; for the temporary_notebook file.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Issue: &lt;/STRONG&gt;I am getting an &lt;STRONG&gt;RMSE&lt;/STRONG&gt; value close to &lt;STRONG&gt;3000&lt;/STRONG&gt;, which does not seem like a good value. As far as I know, its value must be between &lt;STRONG&gt;0 and 1&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;I don't understand how to reduce its value so that my algorithm works well on my data.&lt;/P&gt;&lt;P&gt;Please respond; your reply will be very valuable to me.&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
    <pubDate>Mon, 17 Oct 2016 16:39:03 GMT</pubDate>
    <dc:creator>Manus</dc:creator>
    <dc:date>2016-10-17T16:39:03Z</dc:date>
    <item>
      <title>How to reduce RMSE (Root Mean Squared Error) value for linear regression in machine learning?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-reduce-RMSE-Root-Mean-Squred-Error-value-for-linear/m-p/133877#M43695</link>
      <description>&lt;P&gt;Hi Guys,&lt;/P&gt;&lt;P&gt;I am new to machine learning. I have a dataset of clinical trials that contains both textual and numerical data (I have converted all the textual features to numeric ones using the &lt;STRONG&gt;Divectorization&lt;/STRONG&gt; library of Python).&lt;/P&gt;&lt;P&gt;I have attached the dataset CSV file and the Jupyter Python notebook. Please take a look.&lt;/P&gt;&lt;P&gt;If you want a description of the dataset, please visit the link below. I have used the public data from the clinicaltrials.gov website.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;A href="https://clinicaltrials.gov/ct2/about-studies/glossary" target="_blank"&gt;https://clinicaltrials.gov/ct2/about-studies/glossary&lt;/A&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Problem Statement&lt;/STRONG&gt;: The dataset contains an "ENROLLMENT" column (the number of participants required for a clinical study), and I want my algorithm to predict "ENROLLMENT" based on the training data.&lt;/P&gt;&lt;P&gt;Before opening the files, please change the extension from &lt;STRONG&gt;.txt&lt;/STRONG&gt; to &lt;STRONG&gt;.csv&lt;/STRONG&gt; for &lt;STRONG&gt;ct_gov_results&lt;/STRONG&gt; and from &lt;STRONG&gt;.txt&lt;/STRONG&gt; to &lt;STRONG&gt;.ipynb&lt;/STRONG&gt; for the temporary_notebook file.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Issue: &lt;/STRONG&gt;I am getting an &lt;STRONG&gt;RMSE&lt;/STRONG&gt; value close to &lt;STRONG&gt;3000&lt;/STRONG&gt;, which does not seem like a good value. As far as I know, its value must be between &lt;STRONG&gt;0 and 1&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;I don't understand how to reduce its value so that my algorithm works well on my data.&lt;/P&gt;&lt;P&gt;Please respond; your reply will be very valuable to me.&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Oct 2016 16:39:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-reduce-RMSE-Root-Mean-Squred-Error-value-for-linear/m-p/133877#M43695</guid>
      <dc:creator>Manus</dc:creator>
      <dc:date>2016-10-17T16:39:03Z</dc:date>
    </item>
    <item>
      <title>Re: How to reduce RMSE (Root Mean Squared Error) value for linear regression in machine learning?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-reduce-RMSE-Root-Mean-Squred-Error-value-for-linear/m-p/133878#M43696</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/10447/manoj-dhake.html" nodeid="10447"&gt;@Manoj Dhake&lt;/A&gt;, it depends on the scale of the dependent variable. RMSE is expressed in the same unit as the dependent variable, so it is not restricted to the range 0 to 1. If your target values range from 0 to 100000, an RMSE of 3000 is small, but if they range from 0 to 1, it is huge. Try other combinations of input variables and compare the resulting RMSE values. For the same target, the smaller the RMSE, the better the model.&lt;/P&gt;&lt;P&gt;Also, compare the RMSE on the training data with the RMSE on the testing data. If they are similar, the model generalizes well. If the RMSE for the testing data is much higher than that of the training data, it is likely that you have badly overfit the data.&lt;/P&gt;</description>
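To make the reply above concrete, here is a minimal sketch in plain Python of the two checks it describes: judging RMSE against the range of the target, and comparing training RMSE with testing RMSE. The helper names rmse and normalized_rmse and all of the numbers are assumptions made up for illustration; they do not come from the attached dataset or notebook.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the target values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def normalized_rmse(actual, predicted):
    """RMSE divided by the range (max - min) of the actual values."""
    spread = max(actual) - min(actual)
    if spread == 0:
        raise ValueError("actual values have zero range")
    return rmse(actual, predicted) / spread


# Hypothetical enrollment counts and predictions (illustration only).
train_actual = [0, 20000, 50000, 100000]
train_pred = [3000, 17000, 53000, 97000]
test_actual = [10000, 40000, 80000]
test_pred = [13000, 37000, 83000]

train_rmse = rmse(train_actual, train_pred)
test_rmse = rmse(test_actual, test_pred)

# Every prediction here is off by exactly 3000, so both RMSE values are 3000.
print("train RMSE:", train_rmse)
print("test RMSE:", test_rmse)

# Relative to a target range of 100000, an RMSE of 3000 is about 3 percent.
print("train RMSE / range:", normalized_rmse(train_actual, train_pred))

# A test RMSE far above the train RMSE would hint at overfitting.
print("test / train ratio:", test_rmse / train_rmse)
```

Dividing by the range is only one possible way to put RMSE on a unit-free scale; dividing by the mean or the standard deviation of the target are common alternatives.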
      <pubDate>Wed, 09 Nov 2016 06:34:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-reduce-RMSE-Root-Mean-Squred-Error-value-for-linear/m-p/133878#M43696</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2016-11-09T06:34:21Z</dc:date>
    </item>
    <item>
      <title>Re: How to reduce RMSE (Root Mean Squared Error) value for linear regression in machine learning?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-reduce-RMSE-Root-Mean-Squred-Error-value-for-linear/m-p/312061#M43697</link>
      <description>&lt;P&gt;&lt;SPAN&gt;"If your data has a range of 0 to 100000 then RMSE value of 3000 is small, but if the range goes from 0 to 1." What does "range going from 0 to 1" mean?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2021 14:50:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-reduce-RMSE-Root-Mean-Squred-Error-value-for-linear/m-p/312061#M43697</guid>
      <dc:creator>Brajesh</dc:creator>
      <dc:date>2021-02-24T14:50:27Z</dc:date>
    </item>
  </channel>
</rss>

